
Visual Browsers for Biological Models, Simulation, and Ontologies

Gary Yngve and Jim Brinkley and Dan Cook and Linda Shapiro

March 4, 2008

Abstract

We present a browser that we have developed for the purpose of exploring, understanding, and modifying ontologies in biology and medicine. The browser is an ontology viewer and editor, initially intended for the Foundational Model of Anatomy (FMA), but applicable to other ontologies as well. This browser has two contributions. First, it is a lightweight deliverable that lets someone easily dabble with an ontology as large as the FMA. Second, it lets the user edit an ontology to create a view of it. We also conduct user studies to refine and evaluate the software.

1 Introduction

Data acquisition and the abilities to store and process such data are progressing so fast in biology and medicine that scientists, doctors, and healthcare workers are struggling to keep pace, both in terms of being overwhelmed by the mountains of data and in being unaware of how truly powerful the new technology can be. Having a controlled vocabulary means that data so annotated can be universally shared. The challenges are in developing the controlled vocabulary and getting it adopted by users. The latter challenge is two-fold: people want the gains of a controlled vocabulary without the expense of converting their legacy data to it, and controlled vocabularies are often too cumbersome and not customized enough for the average user.

The computational aspects of biological research today intertwine models, data, and knowledge. An example is the work by Kalet et al.[23] on modeling the spread of cancer. Using knowledge of the connectivity of the lymphatic system and a model of how the cancer would spread, they can produce data that predicts the spread of the cancer. Furthermore, their results could be validated against clinical data. Going one step further, they could use their knowledge of anatomy, their predictions of tumor spread, and a model of the effects of radiation treatment to plan how to target the radiation. The goal would be to maximize damage to cancerous cells while minimizing damage to critical areas. As biologists are not expected to be experts in the allied fields of math and computer science, there is a cognitive gap between their understanding of data, models, and knowledge and their computational representations. Reducing this gap would enable biologists to be more productive.

1.1 Motivation

An ontology consists of a controlled vocabulary and a set of formal relationships and rules between its terms. Ontologies are becoming increasingly popular in the corporate world, with companies such as Ontolica, Ontopia, and SchemaLogic marketing knowledge-base solutions. Ontologies for biology let researchers communicate their models and data with each other, no matter their language or discipline. Furthermore, computers can understand the formalities of an ontology and make deductions. With respect to simulation of the virtual human discussed previously, variables in the model, clinical data, and experimental results can all be catalogued using an ontology. A computer could use the knowledge to generate a model or code. Outside of simulation, being able to catalog an immense amount of uniformly annotated data from distributed sources is immensely valuable. Annotated radiological images could be used for educating young radiologists. Doctors could query a huge database, asking for the histories of, say, patients with colon cancer of a certain cell type and stage who are also in a given gender and age group and have a given set of chronic conditions. Without being able to assimilate data from so many sources, people would not be able to amass enough to have a chance of extracting the similar cases they need. Researchers using data-mining and machine-learning techniques would appreciate such a pool of data, as more data means better results. Finally, an individual patient's entire medical records, including imagery, could be easily shared and distributed among the people who need them.

An ontology is a controlled vocabulary of terms with formal relationships between the terms. Each term can be composed of other terms, and the rules governing the relationships can be defined in the ontology as well. Often the terms will have a taxonomy that allows traits to be inherited. At a minimum, an ontology can be represented crudely in the Resource Description Framework (RDF) as subject-predicate-object triples, or with full rules and inferences in the Web Ontology Language (OWL). An ontology adds value to the data it describes by supporting reasoning and data-mining, as well as by standardizing nomenclature across the world. A fundamental challenge is to combine domain-based ontologies into an overarching semantic web. An alternate, and likely easier, approach is to build foundational ontologies from which people can extract application ontologies for their specific domains.
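As a rough illustration of the triple representation mentioned above, the following sketch stores a few subject-predicate-object triples in plain Python and answers a simple pattern query. The term names, the predicate names, and the `match` helper are hypothetical and chosen only for illustration; they are not taken from the FMA or from any RDF library.

```python
# Illustrative only: a tiny in-memory triple store (all names are hypothetical).
triples = {
    ("Heart", "subclass_of", "Organ"),
    ("Left ventricle", "part_of", "Heart"),
    ("Right ventricle", "part_of", "Heart"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Which terms are recorded as parts of the heart?
parts_of_heart = [s for s, _, _ in match(triples, p="part_of", o="Heart")]
print(parts_of_heart)
```

A real RDF store would add namespaces, typed literals, and indexing; the point here is only that every statement reduces to one triple.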

An application ontology is derived from one or more foundational ontologies, with the ontologies predominantly orthogonal to each other. The application ontology would only contain a small subset of relevant information, both for efficiency and so as not to overwhelm the user in the subdomain. Furthermore, the application ontology may contain subdomain-specific references not present in the foundational ontologies, or it may contain an extra framework to tie the foundational ontologies together. For this article, application ontologies deriving from only a single foundational ontology were considered, to avoid name-collision problems. The first research problem is designing the infrastructure to support queries and edits of the application ontologies. The infrastructure should represent the application as a view layer; that is, the new ontology is not necessarily materialized. Rather, queries to the application ontology undergo a transformation, and data from all of its sources are assimilated together to produce the results, as if the application ontology were materialized. This construction makes it easy to chain layers of application ontology views on top of each other. The second research problem is to develop an effective visualization for creating such an application ontology view. The visualization should give interactive feedback, despite the challenge of operating on ontologies too large to handle all at once. The visualization should be accompanied by powerful search and query features and possess features expected of any commercial application, such as robust saving, loading, undoing, and help. The ontology browser effectively visualizes ontologies and enables users to construct ontology views.

Another roadblock is that individual labs or departments likely have their own legacy formats and notations, and for reasons of convenience, they do not want to spend the effort to standardize. To spur their efforts, ontology researchers should both 1) try to reach out to them and explain how powerful and useful the ontologies are, and 2) try to make the adoption of an ontology as painless as possible. To communicate the power of ontologies to a biologist or doctor, one must know how the biologist or doctor could use them. Fundamentally, this dilemma is a chicken-and-egg problem. Until biologists and doctors possess the technology and find uses for it, informaticists do not know how it is going to be used; likewise, biologists and doctors are not going to spend the time to adopt new technology if there is no use for it. The ontology browser presented in this work specifically eliminates the inconveniences associated with using earlier deliveries of the Foundational Model of Anatomy.

1.2 Contributions

The ontology browser stands apart from related work by scaling well with respect to the size of the ontology. In addition to being a browser, it also is an editor for creating views of an ontology. The browser manages what has been loaded from the back end, what the user has looked at, and what the user should see given what has been loaded and looked at. The result for the user is a seamless experience that preserves context and has maximum relevance. The browser also includes powerful search and query tools, as well as interactive tutorials. The ontology browser has additionally undergone several iterations of feedback and improvements, including a qualitative user study.

In addition to the user interface, the ontology browser has a compact representation for ontologies that can offer in excess of a 90% reduction in size for the underlying data. This compression allows ontologies to be delivered as a small payload without additional support software such as databases. The effect on the user is that the barrier of entry to exploring an ontology is greatly reduced. The belief is that the easier access will encourage the biomedical community to incorporate ontologies into their work, which will later reap great benefits.

A substantial contribution beyond the research and creation of the browser is the engineering necessary to elevate it to the level of commercial software, both in feature sets and robustness. The ontology browser is planned to be adopted as a means for lightweight delivery for other ontologies served from the National Center for Biomedical Ontology (http://www.bioontology.org).

1.3 Synopsis

The ontology browser consists of an underlying architecture to support the notion of chained compact view layers and a visualization on top. The visualization supports quick navigation through the ontology, as well as defining of the views by editing the ontology. It is supplemented with powerful search and query features. The browser has been tested on a use case and is currently undergoing usability evaluations. In addition, a compressed representation of the Foundational Model of Anatomy has been developed; it enables the ontology plus the software to be delivered in total as a small payload. The software runs under Java 1.5, making it both lightweight and platform-independent.

In the rest of this article, the relevant related work in the areas of ontologies and visualization is presented first. Before the ontology browser is described, the theory and architecture beneath it, which are essential to understanding the design choices in the interface and how it works, are explained. Finally, conclusions and future work are discussed.

2 Related Work


2.1 Ontologies

The Foundational Model of Anatomy[38] set the standard as a bioinformatics ontology of canonical anatomy that allows symbolic representation and reasoning over a wide range of relationships. The principal component is the anatomical taxonomy, which attempts to use standard nomenclature while adhering to rigorous definitions and relationships. The authors took much care in defining such nonleaf terms as organs and tissues but noted that inconsistencies are nearly unavoidable when assigning definitions and logic, for example with embryonic anatomy. Another main component is the structural abstraction, used for describing spatial and part-of relationships. A less realized component is the transformational abstraction. The FMA includes nonmaterial geometric abstractions and supports the creation of anatomical sets, including sets defined by functionality. In this work the FMA will be called a foundational ontology, and the ontologies based on the FMA for subdomains will be called application ontologies, though in more philosophical literature, the FMA is considered an application ontology.

The potential value of the FMA in medicine is considerable, though many research and engineering challenges exist to get the FMA and allied technologies in widespread use. One such hurdle is that many labs are handcuffed to their own knowledge bases, and the alignment of their knowledge bases with the FMA can be a difficult or time-consuming task. The alignment software PROMPT[33], which runs inside Protege (http://www.protege.org, a free open-source ontology editor and knowledge-base framework), assists the user in comparing two ontologies and resolving conflicts. Perrin[36] created a visualization plugin for PROMPT using tree maps and conducted a user study where a small population unanimously agreed that Prompt-Vis led to a more enjoyable and effective experience. Another hurdle is to create an application ontology for a subdomain, which might not only be a subset of the foundational ontology, but could also incorporate additional data or modifications.

Lambrix et al.[29] conducted a study of ontology development tools using a subset of the Gene Ontology (GO, http://www.geneontology.org). GO is very different from the FMA in several respects: it is continuously updated as new research happens, rather than being curated with occasional releases, and it has very few relations (just subclass and part), which are nowhere near as topologically complex as in the FMA. They noted that almost all the systems they studied had shortcomings with scalability. In general, they found all the tools they studied to perform well, though they all had learnability issues. Protege had a clear advantage for being so extensible with both plugins and format import/exports.

The CLOVE framework[48] is a start toward creating an application view from a foundational ontology. The authors defined a language that specifies inclusions or exclusions based on simple constraints. There is still much work to do with creating a richer language for views, not even including the possibilities of adding or modifying content or drawing knowledge from multiple ontologies. This research area is likely to be very active in the next few years.

Bernstein et al.[10] developed a natural language system for editing ontologies. Their system prompts the user with possible grammatically correct completions, which addresses both the habitability problem (users expect a limited set of features that the capabilities of the system far exceed) and the ambiguity problem (by restricting the language to a controlled grammar). They conducted studies showing their system is effective, even in the hands of novices, though the jargon associated with ontologies was a difficulty. As their system supports a relatively simple set of edits, they note that they cannot compare their system fairly to Protege; rather, they would need a simplified version of Protege to see if the natural language truly has an advantage.

2.2 Ontology visualization

Interest in ontologies and the semantic web has grown recently, and with that, the need for visualization. Many graph-based visualization techniques have appeared, using tree-maps, force-directed layouts, 3-D navigation, etc. An outstanding survey paper on these techniques[24] attempted to classify them on their functionality and usability for various task domains. The authors note that large ontologies (on the order of 100,000 classes/instances) are especially challenging to visualize for several reasons. The size of the ontologies requires specialized data structures and graphics. For most visualizations attempting to display that many items on a screen simultaneously, the labels are relatively unimportant. For ontologies, however, the labels are very important, and it is hard, if possible at all, to keep labels nonoverlapping and legible when the view is cluttered.

A collaboration by many of the same authors had previously conducted a usability study on four techniques[25], including Jambalaya[43] and TGVizTab[4], both of which are available as plugins to Protege. They defined several general tasks for users to do with the tools to assess the performance of the tools. The ontology they used was small (a few hundred classes, instances, and slots), which is of note, because in our attempts to use the visualization tools on the FMA (hundreds of thousands of classes and instances), they were either too slow (by a factor of 1000) or crashed because they required all the data to be in memory. Another point of discussion is that a particular domain-specific task or a particular knowledge base may be more amenable to a particular tool. This dilemma is a frustrating chicken-and-egg problem where the computer scientist does not know how to best design the tool without knowing how biologists will use it, and the biologists will not use a tool unless it is usable. A further frustration is users' habituation to existing interfaces, exhibited by the fact that the textual class browser, default in Protege, performed the best in their experiments, which they theorized is due to users' prior familiarity with filesystem navigation.

Jambalaya[43] is a well-featured plugin for Protege that visualizes ontologies. It has a zoomable node-link interface with a variety of layouts and supports drag-and-drop from the default class browser. One of its features is that it can nest nodes inside of nodes according to the subclass hierarchy. The relations displayed as colored curved edges can be filtered to what the user wants to see. As the user navigates, transitions are smooth, including a full zoom into a node producing a class browser view.

DIaMOND[16] is a degree-of-importance model for Jambalaya. It is described as attention-reactive, because the visibility of the nodes depends on where the user focuses his or her attention. Items can be labeled as landmark, interesting, or noninteresting. Landmark items are those selected as truly important and never to be hidden. Noninteresting items are either items specifically designated as noninteresting or items whose importance has decayed below a threshold. Interesting items are items that have been accessed or otherwise indicated as interesting, and whose importance has not yet decayed below the threshold.
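The three categories can be made concrete with a small sketch. The decay rule below (importance halves at every step without access) and the threshold value are assumptions made purely for illustration; the text above does not specify DIaMOND's actual formula, and the `Item` class is hypothetical.

```python
# Illustrative sketch; the decay rule and constants are assumptions,
# not the published DIaMOND model.
DECAY = 0.5        # hypothetical: importance halves on every step
THRESHOLD = 0.2    # hypothetical cutoff below which an item is hidden

class Item:
    def __init__(self, name, landmark=False):
        self.name = name
        self.landmark = landmark
        self.importance = 0.0

    def access(self):
        """The user touched this item, so it becomes fully important."""
        self.importance = 1.0

    def step(self):
        """One unit of time passes without the item being accessed."""
        self.importance *= DECAY

    def category(self):
        if self.landmark:
            return "landmark"          # never hidden
        if self.importance >= THRESHOLD:
            return "interesting"
        return "noninteresting"

heart = Item("Heart", landmark=True)
lung = Item("Lung")
lung.access()

# Record how the category of an accessed item changes as attention moves away.
history = []
for _ in range(4):
    history.append(lung.category())
    lung.step()
print(history)
```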

TGVizTab[4] uses a mass-spring system to solve for the layout using forces. It has many features in common with Jambalaya, but of importance is that children of a hierarchy may not all appear at the same level of depth. In the study of Katifori et al.[25], many users found the layout to be choppy and chaotic, though despite the frustrations, the users performed very well with the tool.
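For readers unfamiliar with mass-spring layouts, the following is a minimal sketch of one relaxation step of a generic force-directed layout: every pair of nodes repels, and every edge acts as a spring with a rest length. This is a textbook-style illustration, not TGVizTab's implementation; the function name and all constants are assumptions.

```python
import math

def layout_step(pos, edges, rest=1.0, k_spring=0.1, k_repel=0.05):
    """Return new positions after one relaxation step (all constants are assumed)."""
    force = {n: [0.0, 0.0] for n in pos}

    def push(a, b, magnitude):
        # Apply a force of the given magnitude on a, directed away from b,
        # and the opposite force on b. A negative magnitude attracts.
        dx = pos[a][0] - pos[b][0]
        dy = pos[a][1] - pos[b][1]
        d = math.hypot(dx, dy) or 1e-9
        fx, fy = magnitude * dx / d, magnitude * dy / d
        force[a][0] += fx
        force[a][1] += fy
        force[b][0] -= fx
        force[b][1] -= fy

    nodes = list(pos)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(pos[a], pos[b]) or 1e-9
            push(a, b, k_repel / (d * d))          # pairwise repulsion
    for a, b in edges:
        d = math.dist(pos[a], pos[b])
        push(a, b, -k_spring * (d - rest))         # spring toward rest length
    return {n: (pos[n][0] + force[n][0], pos[n][1] + force[n][1]) for n in pos}

# Two connected nodes that start too far apart settle near the rest length.
pos = {"Heart": (0.0, 0.0), "Left ventricle": (3.0, 0.0)}
for _ in range(200):
    pos = layout_step(pos, [("Heart", "Left ventricle")])
gap = math.dist(pos["Heart"], pos["Left ventricle"])
```

Because positions are recomputed from forces rather than from the hierarchy, nothing forces siblings onto the same depth level, which is consistent with the behavior noted above.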

2.3 Relevance to the Browser

Many other researchers have built ontology visualization tools; most are summarized in Katifori's survey[24]. Of these tools, Jambalaya[43] and TGVizTab[4], both plugins to Protege, performed the most successfully[25]. Protege, an impressive and successful authoring tool, can also function as a browser, though as browsing is not its primary function, it is not optimized for browsing. The ontology browser's choice of relevant nodes to display is a simplification of the degree-of-importance present in DIaMOND[16]; the browser could be easily extended to support their full degree-of-importance model. The ontology browser uses an extension of Yee et al.'s radial layout algorithm[51] designed to handle cycles.

3 Ontology Browser Foundations

Ontologies are large collections of terms, relationships between them, and rules for reasoning about the contained knowledge. The ontology browser relies on a substantial backend to manage the data loaded from the ontology and any modifications to the ontology. Furthermore, the constraints on how an ontology can be changed dictate the interactions granted to the user.

3.1 Ontology Views

The ontology browser visualizes a view of an ontology and can be used to produce another view of an ontology. The view can be the identity (the ontology itself) or it can be a derived view produced by the ontology browser or some other application/service. A view acts as a virtual ontology, without actually storing or materializing the ontology. As a result, a view has a compact representation and can be changed with minimal overhead. Views are useful for presenting abstractions to the user, for example to hide irrelevant information from a user with specific needs. One can also add or change content in a view. Theoretically a view could also incorporate multiple ontologies, although if the ontologies are not orthogonal, there could be naming conflicts that need to be resolved. In database theory, a view is specifically defined as a virtual table constructed from the result set of a query. For the ontology view, the recording of the modifications to the ontology is not necessarily a query, but foreseeably, should be translatable into one.

Figure 1: The DataLayer produces a view of an ontology. DataLayers can be chained together, and the view can be materialized to a new ontology.

Figure 1 illustrates the basic architecture of the foundational ontology, intermediate views, and the eventual visualization or other application. A sequence of chained views is abstractly the source ontology and indistinguishable from a materialized ontology. Each of these views is called a DataLayer. In the next section, the VisualLayer, which adds visual components to the view for the sake of the visualization, will be introduced. The ontology browser specifically allows the user to issue modifications to the DataLayer via the VisualLayer. The modified DataLayer can be saved as a new ontology view and distributed to others compactly or materialized.
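The chaining idea can be sketched in a few lines. The classes below are a hypothetical simplification written for this illustration, not the browser's actual Java implementation: a layer records only deletions and overridden slot values, forwards every other query to its source, and can be flattened into a new materialized ontology on demand.

```python
class BaseOntology:
    """A materialized ontology: entity -> {slot: value}."""
    def __init__(self, data):
        self._data = data

    def entities(self):
        return set(self._data)

    def slots(self, entity):
        return set(self._data[entity])

    def get(self, entity, slot):
        return self._data[entity].get(slot)

class DataLayer:
    """A view that records edits instead of copying its source (hypothetical API)."""
    def __init__(self, source):
        self.source = source      # a BaseOntology or another DataLayer
        self.deleted = set()
        self.overrides = {}       # entity -> {slot: value}

    def delete(self, entity):
        self.deleted.add(entity)

    def set_value(self, entity, slot, value):
        self.overrides.setdefault(entity, {})[slot] = value

    def entities(self):
        return (self.source.entities() | set(self.overrides)) - self.deleted

    def slots(self, entity):
        below = self.source.slots(entity) if entity in self.source.entities() else set()
        return below | set(self.overrides.get(entity, {}))

    def get(self, entity, slot):
        if entity not in self.entities():
            raise KeyError(entity)
        local = self.overrides.get(entity, {})
        if slot in local:
            return local[slot]
        if entity in self.source.entities():
            return self.source.get(entity, slot)   # fall through to the layer below
        return None

    def materialize(self):
        """Flatten the whole chain into a new stand-alone ontology."""
        return BaseOntology(
            {e: {s: self.get(e, s) for s in self.slots(e)} for e in self.entities()}
        )

fma = BaseOntology({"Heart": {"name": "Heart"}, "Lung": {"name": "Lung"},
                    "Kidney": {"name": "Kidney"}})
thorax = DataLayer(fma)
thorax.delete("Kidney")
teaching = DataLayer(thorax)                 # a view chained on top of a view
teaching.set_value("Heart", "note", "pumps blood")
flat = teaching.materialize()
```

Each layer stores only its own edits, so a saved view stays small regardless of the size of the source ontology.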

The ontology definition used here does not include constraints on domains and ranges, cardinalities, etc., for several reasons. First, the original ontology may not be compliant even if it claims to be so, and it is pointless to try to uphold semantics that are not true in the first place. Second, too many constraints may limit users' flexibility or, at best, create a challenging user-interface problem for the developer. Finally, for a prototype, it is not necessary to solve the problem fully, but rather to provide a template that can be refined. Elsewhere in the literature, one may see attributes and relations all grouped together as slots, with constraints called facets. Further definitions may follow for distinguishing instances from classes or allowing multiple inheritance, such as for the sake of allowing an entity to be both a class and an instance. The view taken in this work is that for the user, attributes and relations have significantly different connotations, which warrants reasoning about them separately. A second assumption is that the user does not care about the subtlety of a top-level class inheriting both from root and a template. Reified relations, which are really instances that contain properties associated with the specified relations, are specifically ignored, but adding them to the framework would not be difficult.

3.2 Ontology semantics and modifications

We have several pragmatic requirements and conventions for the ontologies viewed and edited in our system. First, the ontologies must have a unique slot representing the subclass relationship and must have a unique slot representing a human-readable name (not necessarily the "class name"). These conventions make the tool easily configurable by the user without any coding. Additionally, most relations have inverse relations (e.g. subclass/superclass, part/part of), and if a triple (s, p, o) exists and p is invertible, then the inverse triple (o, p', s) must exist, where p' denotes the inverse of p. Though not a strict requirement, we feel that the best user experience results when certain basic inheritance properties are obeyed with respect to subclasses and subslots. Slots must not change type. If a slot exists for a class, it exists in subclasses. Values for a subslot are also values for the parent slot. Finally, and perhaps most controversially, the deletion capabilities only support single inheritance. Many ontology representations support multiple inheritance, but as has been found with programming languages, multiple inheritance can result in ambiguity. We attempted to formulate a series of rules governing whether a class is deleted when some of its multiple ancestors are deleted or undeleted, and decided instead to require a single superclass for deletions of subtrees. This compromise would still allow superclasses in the style of templates, which we have seen in ontologies; these are analogous to interfaces or mix-ins found in programming languages. Some ontologies use facets to represent further constraints (e.g. cardinality of value sets). We do not currently use these constraints for several reasons. First, the original ontology may not be entirely self-consistent with respect to these constraints. Second, a novice user might not need the constraints, and enforcing the constraints would complicate the interface. Finally, deciding how best to handle constraints (allow the ontology to temporarily exist in an illegal state, or rather make atomic operations that always preserve the constraints) is a hard problem that would require much study of users and their needs.
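The inverse-relation convention lends itself to a mechanical check. The sketch below assumes that each invertible slot has a named inverse slot listed in a lookup table; the table contents, slot names, and helper functions are hypothetical and only illustrate the invariant.

```python
# Hypothetical table pairing each invertible slot with its inverse slot.
INVERSE = {
    "subclass": "superclass", "superclass": "subclass",
    "part": "part_of", "part_of": "part",
}

def missing_inverses(triples):
    """List the inverse triples that the convention requires but that are absent."""
    present = set(triples)
    return sorted(
        (o, INVERSE[p], s)
        for (s, p, o) in present
        if p in INVERSE and (o, INVERSE[p], s) not in present
    )

def add_with_inverse(triples, s, p, o):
    """Add a triple and, when the slot is invertible, its inverse as well."""
    triples.add((s, p, o))
    if p in INVERSE:
        triples.add((o, INVERSE[p], s))

triples = {("Heart", "part", "Left ventricle")}
before = missing_inverses(triples)     # the part_of triple is not there yet
add_with_inverse(triples, "Heart", "part", "Right ventricle")
add_with_inverse(triples, "Left ventricle", "part_of", "Heart")
after = missing_inverses(triples)
```

Routing every edit through a helper such as `add_with_inverse` is one way to keep the invariant true by construction rather than checking it afterwards.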

Now that we have specified the semantics required for ontologies in our system, we can define transformations on the ontologies, as well as what needs to be done to guarantee that all the semantics still hold. We will consider the transformations of adding, deleting, or changing classes (entities) and slots. Due to a difference in complexity, we will refer to slots that map to instances or primitives as attributes and slots that map to classes and have inverses as relations.

3.2.1 Additions

If we introduce a new entity, it must be assigned a parent. The new entity will inherit the parent's set of allowed attributes and relations. Adding a new attribute to the ontology requires no further operations. Adding an attribute to an entity means that all descendants of that entity now inherit the attribute. A relation must be added to the ontology with its inverse. As with adding an attribute to an entity, adding a relation to an entity means that all descendants of the entity now inherit the relation. A subrelation can be added to an entity only if the relation already belongs to that entity.
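One simple way to obtain the inheritance behavior described above is to store only the slots declared directly on each entity and to compute the allowed set by walking up the parent chain. The `Entity` class and `add_subrelation` helper below are hypothetical names for a minimal sketch of that idea, assuming single inheritance.

```python
class Entity:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent       # a new entity must be given a parent (None only for the root)
        self.own_slots = set()     # slots declared directly on this entity

    def allowed_slots(self):
        """Own slots plus everything inherited from the ancestors."""
        inherited = self.parent.allowed_slots() if self.parent else set()
        return inherited | self.own_slots

def add_subrelation(entity, subrelation, relation):
    """Allow a subrelation only where its parent relation is already allowed."""
    if relation not in entity.allowed_slots():
        raise ValueError(f"{relation!r} is not allowed on {entity.name!r}")
    entity.own_slots.add(subrelation)

organ = Entity("Organ")
heart = Entity("Heart", parent=organ)

organ.own_slots.add("part")                      # descendants inherit it automatically
add_subrelation(heart, "regional_part", "part")  # legal: heart inherits "part"

try:
    add_subrelation(organ, "branch", "tributary")  # illegal: "tributary" is not allowed
    rejected = False
except ValueError:
    rejected = True
```

Computing inheritance lazily means that adding a slot to an entity needs no update of its descendants, at the cost of a walk up the hierarchy on each lookup.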

3.2.2 Deletions

Deleting an attribute from the alphabet of attributes for the ontology results in all instances of the attribute in entities being deleted. Deleting an attribute from the allowed list of attributes for an entity is only legal if the attribute is not present in the parent (or the entity is the root). Deleting is straightforward, though one must decide whether to add the attribute to any of the children or to delete the attribute from the children as well. Deleting a relation from the alphabet of relations for the ontology results in all instances of the relation and its inverse in entities being deleted. Deleting a relation from the list of allowed relations for an entity is only legal if the relation is not allowed for the parent. Again one may decide to add back the relation for a child. Deleting the relation removes its value set from the entity, which in turn means that the value sets for the inverse relation need to be modified. Deleting a relation deletes its subrelations.

Deleting an entity means that either its descendants are deleted too, or their parents need to be reassigned to the parent of the deleted entity. Again the two choices lead to two editing operations for the user. All slots associated with the entity are deleted, and the entity is removed from the value sets of the inverse relations. The logic can be applied to other user operations, for example, undeleting an entity that had previously been deleted. The parents of the undeleted entity would likewise need to be undeleted, and it would be a design choice whether or not to undelete the children.
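The two ways of deleting an entity can be sketched on a bare hierarchy stored as a child-to-parent dictionary. This is a minimal illustration with hypothetical function names; it covers only the subclass hierarchy and leaves out the cleanup of slots and inverse value sets described above.

```python
def descendants(parent_of, entity):
    """Collect every entity below the given one in the hierarchy."""
    found = []
    stack = [entity]
    while stack:
        current = stack.pop()
        for child, parent in parent_of.items():
            if parent == current:
                found.append(child)
                stack.append(child)
    return found

def delete_subtree(parent_of, entity):
    """Option 1: delete the entity together with all of its descendants."""
    for node in descendants(parent_of, entity) + [entity]:
        del parent_of[node]

def delete_and_reparent(parent_of, entity):
    """Option 2: delete only the entity; its children move up to its parent."""
    grandparent = parent_of[entity]
    for child, parent in list(parent_of.items()):
        if parent == entity:
            parent_of[child] = grandparent
    del parent_of[entity]

tree = {"Organ": None, "Heart": "Organ",
        "Left ventricle": "Heart", "Right ventricle": "Heart"}

pruned = dict(tree)
delete_subtree(pruned, "Heart")

spliced = dict(tree)
delete_and_reparent(spliced, "Heart")
```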

3.2.3 Changes

The value of an attribute can be changed from undefined to defined, or it can be changed from one value to another. The type of an attribute can only be changed if for all instances of it, the values are undefined. A simple constraint extension is to specify whether or not an attribute can be overridden once defined. Changing the value set of a relation on an entity (akin to adding or deleting an edge) simply requires the corresponding inverse edge be changed.
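The inverse-edge bookkeeping for relation changes can be sketched as follows. This is a minimal illustration rather than the browser's actual code; the `RelationStore` class, its method names, and the sample entities are hypothetical.

```python
from collections import defaultdict


class RelationStore:
    """Toy edge store that keeps every relation consistent with its inverse."""

    def __init__(self, inverse_of):
        # inverse_of maps a relation name to its inverse, e.g. "part" -> "part of".
        # A symmetric relation would map to itself.
        self.inverse_of = dict(inverse_of)
        # (entity, relation) -> set of target entities (the value set).
        self.values = defaultdict(set)

    def add_edge(self, source, relation, target):
        """Add the edge and the corresponding inverse edge."""
        self.values[(source, relation)].add(target)
        self.values[(target, self.inverse_of[relation])].add(source)

    def delete_edge(self, source, relation, target):
        """Delete the edge and the corresponding inverse edge."""
        self.values[(source, relation)].discard(target)
        self.values[(target, self.inverse_of[relation])].discard(source)


store = RelationStore({"part": "part of", "part of": "part"})
store.add_edge("Heart", "part", "Left atrium")
```

Because both directions are updated together, a reverse lookup such as `store.values[("Left atrium", "part of")]` never needs to scan the forward edges.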

A special operation is to change the parent of an entity. To make the entity a child of its grandparent or further ancestor, only the parent edge and its inverse need to be reassigned. It will keep all the same sets of allowed attributes, relations, etc. For implementation purposes, the entity may have some of those attributes, relations, etc. explicitly allowed for it rather than inherited from the parent. To make the entity a child of one of its children or further descendent, that descendent becomes the child of the entity’s parent (as just described), and the entity becomes its child. However in this case, the entity may need to inherit some attributes and relations from its new parent. To make the entity a child of an unrelated entity, there may be attributes and relations that are no longer required as well as others that it needs to inherit.

3.3 DataLayer: Implementation of an Ontology View

Now that we have defined the legal transformations on the ontology, we must think how to implement the recording of these changes and the necessary queries involved to obtain the needed information. To summarize, the changes include adding or removing from the alphabets of entities and slots, adding or removing from the allowed slots for an entity, and modifying the values of slots for an entity. Semantics regarding subclasses, subslots, and inverse relations need to be enforced. Challenges arise when ensuring that the implementation scales well for large ontologies.

3.4 Ontology Assumptions

To guide our design choices, we will make the following assumptions about ontologies:

semantics of the original ontology are sound: Though a user may not need a rich set of semantics on a derived view, a minimal set of semantics is required for the efficient logic used in the DataLayer. Relations must have inverses, and each specific edge must have a corresponding inverse (possibly itself). This rule allows for an inexpensive reverse lookup and allows for easy forward and backward navigation. Additionally, simple semantics regarding inheritance must hold (e.g. a slot defined in the parent must be defined in the child), and single inheritance must be present to allow deleting and undeleting subtrees. Note that if a user wants to effectively ignore all inheritance rules in a view, an option is to flatten the taxonomy completely.

ontologies are large: We will assume that it is not feasible to store the whole ontology or even all the transitive descendents of an entity. As a result, any queries or edits will be propagated toward the root. If a child needs to know about a change above, it will ask its parent (recursively), rather than the parent broadcasting to the children.

not too many slots: We will assume it is feasible to store the set of defined slots in memory.

entity-relation graph is sparse: The entity-relation graph is sparse, so it will be best to store entity-relation graphs as adjacency lists rather than as adjacency matrices.

subclass hierarchy is not too deep: It will not be too costly to store, query, or traverse all the ancestors for an entity.

small set of changes to create a view: A user will create a view through a small amount of work. Hence the view can be represented by a concise set of changes (as a log, as query transformations, etc.) rather than materializing the modified ontology.

Most ontologies are small enough that many of these assumptions are unnecessary. However, the current release of the Foundational Model of Anatomy[38] has roughly 78,000 entities, 200 types of attributes and relations, in excess of a hundred thousand instances (many just synonyms but some reified relations), and a million entity-relation edges. Its depth is 19 (Dorsal digital vein of left big toe, along with 17 other veins of the feet, are of that depth). Only the part relation has subrelations. The value sets for most relations on entities are small, often one or two. Notable exceptions are that hundreds of types of tendons have Tendon as a parent, and Bona-fide anatomical space has thousands of direct instances.

3.4.1 A compressed representation for the FMA

The alphabet of attributes, relations, and attributed relations for the FMA can be assigned to different bits, and then any combination of them that is allowed for an entity can be represented by a 32-byte bitfield. In contrast, if pointers were stored per attribute or relation, each would take a full 32 bits. Furthermore, allowed attributes can be stored as differences from parent to child, and the full attribute set integrated computationally by traversing to the root. Each entity is stored in a map linking to all its relations’ value sets. The format is packed to be space-efficient but not necessarily time-efficient; time-efficiency would be irrelevant were disk access needed. Attributes that are strings are stored via Lucene (http://lucene.apache.org). The storage is compact, and the search capabilities are quite impressive.
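The parent-to-child difference encoding can be illustrated with a short sketch. It assumes that a child only ever adds slots to what its parent allows, in line with the inheritance rules above. The `Entity` class, the `SLOT_BITS` table, and the sample slot names are hypothetical and are not the FMA's actual alphabet.

```python
class Entity:
    """Entity whose allowed slots are stored as additions relative to its parent."""

    def __init__(self, name, parent=None, added=0):
        self.name = name
        self.parent = parent
        self.added = added  # bits switched on at this entity only

    def allowed_slots(self):
        """Rebuild the full bitfield by walking up to the root."""
        bits = 0
        node = self
        while node is not None:
            bits |= node.added
            node = node.parent
        return bits


# Hypothetical slot alphabet: one bit per attribute or relation.
SLOT_BITS = {"name": 1 << 0, "part": 1 << 1, "part of": 1 << 2, "has mass": 1 << 3}

root = Entity("Anatomical entity", added=SLOT_BITS["name"])
organ = Entity("Organ", parent=root, added=SLOT_BITS["part"] | SLOT_BITS["has mass"])
heart = Entity("Heart", parent=organ, added=SLOT_BITS["part of"])

# A 256-bit alphabet fits in 32 bytes when packed for storage.
packed = heart.allowed_slots().to_bytes(32, "little")
```

The cost of a lookup is one step per ancestor, which stays small under the assumption that the subclass hierarchy is not too deep.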

Though a “proprietary” format goes against the principles of the web, as long as converters exist, there are many advantages to having a compact representation. The Protege/MySQL incarnation of the FMA takes in excess of half a gigabyte of disk space and takes several hours to download and construct the database. Furthermore, requiring the additional applications MySQL and Protege discourages the average user from dabbling. An additional problem with MySQL is that it may require administrator privileges to install. Instead of having to communicate over the network because the FMA is too large to store locally, or spend half a day installing it locally, the whole FMA can be conveniently downloaded in under a minute as a single payload and be ready to go. An important concern in the medical community is patient privacy. Note that the compact proprietary format could be encrypted, should there be sensitive data present in an ontology. Only authenticated software could then properly read the data, and no security would be lost by storing the data locally.

Another assumption that is applicable is that foundational ontologies such as the FMA do not change often. In other words, the user would not have to re-download often, nor would the expensive conversion of the FMA to a condensed format have to be performed often by the curators. Furthermore, if a curator wanted to transmit minor changes about the FMA, the changes could be encoded in a DataLayer, in essence a patch.

3.5 Implementation

The DataLayer is an abstract interface that loads information from a previous DataLayer, sends information to the next DataLayer, and performs modifications. At the bottom are implementations of the DataLayer that operate directly on specific databases or web services. However, the most important DataLayer is the DataLayer that operates on a previous DataLayer. The ability to chain DataLayers means that any sequence of DataLayers can be perceived as an abstract source, perhaps even materialized as an ontology. In the future, DataLayers might even load information from multiple ontologies, not just a single source. Figure 2 shows the insides of a DataLayer. Each of the other DataLayers in the figure has, self-similarly, its own insides. The asking arrow from a DataLayer to a previous DataLayer represents the load interface, a set of methods loading information on entities, relations, attributes, etc., and the sending arrow from a DataLayer to the next DataLayer represents the send interface, a set of corresponding methods to send the data.

Figure 2: The inner workings of a DataLayer. The cached data from the previous DataLayer’s output plus the modifications are assembled to form this DataLayer’s output.
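The chaining idea can be sketched in a few lines. This is an illustration under simplifying assumptions, not the actual DataLayer interface: the class names `SourceLayer` and `ViewLayer` and the single `load_values` method are hypothetical, only relation edges are handled, and caching and inverse-edge maintenance are left out.

```python
class SourceLayer:
    """Bottom layer: wraps a concrete source (here, an in-memory dict)."""

    def __init__(self, edges):
        self._edges = edges  # (entity, relation) -> set of targets

    def load_values(self, entity, relation):
        # Return a copy so that callers cannot alter the source.
        return set(self._edges.get((entity, relation), ()))


class ViewLayer:
    """Layer that records edits and applies them to the previous layer's output."""

    def __init__(self, previous):
        self.previous = previous
        self._added = {}
        self._removed = {}

    def add_edge(self, entity, relation, target):
        key = (entity, relation)
        self._added.setdefault(key, set()).add(target)
        self._removed.get(key, set()).discard(target)

    def delete_edge(self, entity, relation, target):
        key = (entity, relation)
        self._removed.setdefault(key, set()).add(target)
        self._added.get(key, set()).discard(target)

    def load_values(self, entity, relation):
        key = (entity, relation)
        values = self.previous.load_values(entity, relation)
        values -= self._removed.get(key, set())
        values |= self._added.get(key, set())
        return values


base = SourceLayer({("Heart", "part"): {"Left atrium", "Right atrium"}})
first_view = ViewLayer(base)
first_view.delete_edge("Heart", "part", "Right atrium")
second_view = ViewLayer(first_view)
second_view.add_edge("Heart", "part", "Left ventricle")
```

Each layer stores only its own changes, so the base source is never modified and any prefix of the chain can be queried as if it were an ontology of its own.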

3.5.1 Deletions and Exceptions

In the spirit of the ontology assumptions, the DataLayer represents the deletion of a subtree of entities by storing a deletion mark plus a timestamp in a hash table indexed by the entity. When an entity is queried, the DataLayer checks to see if any of its ancestors are marked deleted. Given that the number of total deletions is likely small and the number of ancestors for an entity is small, this implementation is inexpensive with respect to both space and time.

Representing only deleted subtrees is not expressive enough to be useful. The DataLayer augments deletions with exceptions to deletions. When an entity is an exception to a deletion, it, all its children, and all its ancestors are no longer deleted. The hash table storing deletions also stores these exceptions. An exception is marked by denoting the source of the exception with a special mark and then marking the source’s ancestors with another mark. When the traversal of ancestors happens, the order of the timestamps resolves whether a deletion came before or after an exception. To summarize, an entity is considered deleted in one of two situations:

• the most recent timestamp for a mark of itself or its ancestors is a delete

• the most recent timestamp for a mark of itself or its ancestors is an except, and neither the entity itself is most recently marked excepted nor do the entity’s ancestors include the source of the exception
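One way to realize these rules is sketched below. It is an interpretation, not the DataLayer's actual code. The assumptions are that a mark on an ancestor of an exception's source (kind `"path"` here) undeletes only that ancestor itself, so it is ignored when querying other entities, that each entity keeps a list of marks rather than a single entry, and that a counter stands in for real timestamps. All names are hypothetical.

```python
import itertools


class DeletionMarks:
    """Subtree deletions and exceptions recorded as timestamped marks."""

    def __init__(self, parent_of):
        self.parent_of = parent_of        # entity -> parent (root maps to None)
        self.marks = {}                   # entity -> list of (timestamp, kind)
        self._clock = itertools.count(1)  # stand-in for real timestamps

    def _ancestors(self, entity):
        node = self.parent_of.get(entity)
        while node is not None:
            yield node
            node = self.parent_of.get(node)

    def delete(self, entity):
        """Mark the subtree rooted at entity as deleted."""
        self.marks.setdefault(entity, []).append((next(self._clock), "delete"))

    def except_(self, entity):
        """Mark entity as the source of an exception and mark its ancestors."""
        stamp = next(self._clock)
        self.marks.setdefault(entity, []).append((stamp, "except"))
        for ancestor in self._ancestors(entity):
            self.marks.setdefault(ancestor, []).append((stamp, "path"))

    def is_deleted(self, entity):
        # Marks on the entity itself always count. From ancestors, only the
        # subtree-wide kinds ("delete" and "except") count.
        candidates = list(self.marks.get(entity, []))
        for ancestor in self._ancestors(entity):
            candidates += [m for m in self.marks.get(ancestor, []) if m[1] != "path"]
        if not candidates:
            return False
        latest = max(candidates, key=lambda mark: mark[0])
        return latest[1] == "delete"


# Hypothetical tree: R -> A -> {B -> E, D}, and R -> S.
tree = {"R": None, "A": "R", "S": "R", "B": "A", "D": "A", "E": "B"}
marks = DeletionMarks(tree)
marks.delete("A")
deleted_before = {e for e in tree if marks.is_deleted(e)}
marks.except_("B")
deleted_after = {e for e in tree if marks.is_deleted(e)}
```

After the deletion of `A`, its whole subtree appears deleted; after excepting `B`, only `D` remains deleted, since `B`, its descendant `E`, and its ancestors are restored. Each query touches only the entity and its ancestors, which matches the cost argument made above.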

A final sticky point is the interaction with the change-parent operation and subtree deletions and exceptions. The exception chain described previously may be broken by the move, and the new ancestors of the entity may have their own deletions/exceptions. There is no correct answer to what the solution is, but one important consideration is to minimize the confusion of the user. The confusion arises in particular because in the visualization, there is no representation of what deletions or exceptions were performed or when, so the user could be surprised if a subtree were moved and suddenly all its children changed whether or not they appeared deleted. Another consideration is that any further changes be easily undoable. The rule is to preserve the outward appearance as much as possible upon the move. The additional operations are as follows:

• If the entity and the entity’s to-be-reassigned parent both appear deleted, the entity is marked deleted and the parent is reassigned.

• If the entity appears deleted but the entity’s to-be-reassigned parent does not appear deleted, then the entity is marked deleted and the parent is reassigned.

• If both the entity and the entity’s to-be-reassigned parent do not appear deleted, then the appearance of the subtree needs to be preserved. The most recent of the deletes and excepts performed on the entity’s ancestors are performed on the entity itself, but its timestamp is made current. This mark also has an extra annotation that any timestamps in the descendents trump any timestamps of value less than current residing in the new ancestors. The parent is reassigned.

• If the entity does not appear deleted but the entity’s to-be-reassigned parent appears deleted, then further specification is needed whether to delete the entity or to except the to-be-reassigned parent. Once either choice is performed, the state resolves to one of the previous cases.

4 Ontology Browser Interface

The browser visualizes an ontology as a graph, with the ontology’s entities as nodes and relations as edges. The goal is for the visualization to be used for both exploration of the ontology and constructing new views. As opposed to other ontology visualizations that try to visualize all entities or all relations (if not filtered), the approach taken here is a compromise: some entities and one or two relations. In some cases, these other applications may have specific reasons for their choices, such as wanting to reveal to the viewer the overarching topology of a complex network. There are two reasons for the compromise, and both address the needs of the user. The first is that the user wants to see a network with a potential depth of three or more (showing just one or two deep often is not interesting and can be accomplished equally well as a hyperlinked list), and it is important for the user to see the names of all the nodes at once. The second reason is that the user cannot comprehend more than a few relations at once, not even counting the extreme clutter that all those nodes and edges would cause. The clutter of nodes is exacerbated by the fact that for an ontology, unlike a general network, the labels have great significance and should be easily readable, which means that the labels occupy much space.

The interaction was designed with several principles to improve the usability and avoid any confusion.

smooth transitions: Smooth transitions help maintain the cognitive connection between the old view and the new view.

zoomable: A zoomable interface offers flexibility for having a big layout with small labels that get larger with zooming, or for having a node produce details with zooming.

speed: The software needs to be fast to be interactive. Queries should be kept to a minimum, and information should be cached.

lazy evaluation: The whole reference ontology cannot necessarily exist in memory at once, and there should not be a need to read all of it. Visible nodes and edges should be cached and swapped as needed.

visibility vs existence: It should be clear whether nodes and edges are merely invisible on-screen or do not exist in the ontology.

reversibility: Any change to the state of the visualization or modification to the ontology should be undoable in succession.

changed vs original: The visualization should have the ability to display what nodes and edges are part of the original reference ontology and what are modified/new/missing.

memory of state: When the user collapses a subtree and re-expands it, or switches from one primary relation to another and back, the older layout should be replicated. Furthermore, the user should be able to save the state of the system (as a workspace), close the program, and start again the following day as if the closing and restarting had not happened.

4.1 VisualLayer

The VisualLayer stores visual data about the entities and relations and acts as the go-between for the DataLayer and any interface. Many of the design principles for the DataLayer apply, but they may be even stronger for the VisualLayer. Notably, more entities and relations than are stored in the DataLayer’s cache are needed for the visualization, and the visualization has even tighter time and space constraints. For each entity, extra visual information includes whether the node is visible, has been touched by the mouse, and is tagged a certain color, as well as temporary variables needed for the layout. Even more space-consuming is for each combination of entity and relation, whether all of its edges have been loaded, if it is currently expanded or collapsed, if there are deleted edges being hidden, what subrelations are visible, etc. All of this information is stored in bitfields for compactness and with a lazy data structure so that no space is needed to store state on unaccessed data.
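A lazy bitfield store of this kind might look like the following sketch. The flag names and the `LazyFlagStore` class are hypothetical; the point is only that a key occupies space once a bit has been set for it, and that unaccessed keys read as all-zero.

```python
# Hypothetical per-node flags, one bit each.
VISIBLE = 1 << 0
TOUCHED = 1 << 1
TAGGED = 1 << 2


class LazyFlagStore:
    """Keeps a bitfield only for keys that have a bit set; all others read as 0."""

    def __init__(self):
        self._flags = {}

    def get(self, key):
        return self._flags.get(key, 0)

    def set(self, key, flag):
        self._flags[key] = self.get(key) | flag

    def clear(self, key, flag):
        remaining = self.get(key) & ~flag
        if remaining:
            self._flags[key] = remaining
        else:
            self._flags.pop(key, None)  # drop the entry once no bits remain

    def __len__(self):
        return len(self._flags)


node_flags = LazyFlagStore()
node_flags.set("Heart", VISIBLE)
node_flags.set("Heart", TOUCHED)
```

The same store could be keyed by an (entity, relation) pair to hold the per-relation state such as loaded, expanded, or has hidden deletions.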

Another function of the VisualLayer is to decide what edges to load. Certainly if the user asks to expand all the outgoing edges for a relation on an entity, all those edges will be expanded. But suppose the user switches to a different relation. What edges should be loaded? Logically, the edges connecting nodes that have been previously touched should be expanded. Because there are so many different relations and a user may end up looking at only a handful of them, loading these edges happens on an as-needed basis. The answer is that no new nodes are loaded, but any existing edges that have not been loaded into the VisualLayer are loaded. To perform this process efficiently, nodes are tagged as clean or dirty, per relationship. By default, nodes are dirty in all relationships. A node becomes clean for a relationship when all of its incoming and outgoing edges have been explored (see footnote 4), or the method to search for unloaded edges has processed it. The method searches the dirty nodes and skips the clean ones.

Footnote 4: For relations that are constrained to form trees, it suffices to check consistently either incoming or outgoing edges, and for a relation such as subclass, checking just the incoming edges is far more efficient.

4.2 Navigation

The ontology browser allows a user to interact quickly and easily with a large ontology. In addition to being a browser, the program also supports the construction of a view of an ontology, that is, adding a layer on top of the ontology that gives the perception that the ontology has been modified. Modifications include adding and deleting terms (classes) and redefining the relationships between terms (by adding and deleting individual links). All operations are available by right-clicking on a node and choosing an option from a popup menu. Operations that modify are all contained under a submenu modify. Common navigation operations are also available via single-clicks or double-clicks with the left mouse button. Each entity in the ontology is a node with its name inside, and edges represent relations. Additionally, the number of visible edges around the node along with the total edges around the node can be displayed as part of the name, and the name can be abbreviated to conserve space on the display. The user can hover over a node or an edge to get its full name or meaning. Normally only one type of relation can be seen at once. The current relation is printed in the top left. An arrow in the middle of the edge indicates the edge’s direction. A bidirectional edge (a symmetric relation) has two arrows. The layout is a radial tree layout based on Yee et al.[51], with extra code to handle cross edges and back edges that do not arise in a pure tree.

Initially just a single node is visible. To see more nodes, data must be loaded from the backend, which the user does by expanding the node. This data is not loaded automatically, as it is in other visualizations, for two reasons. The first is that it may be slow to load the data, and the second is that the user may not want to see all the other data, as a relevant subset may already be loaded. Figure 3 shows a sequence of operations while navigating the ontology. The first three operations are expanding the related entities for the selected relation, in this case, part. The nodes appear in a tree rooted at the original node. Because it is important to distinguish whether the visible neighbors of a node are actually all the neighbors (for the primary relation) or not, a solid border denotes that all neighbors of the node for the current relation have been loaded, whereas a dashed border denotes that there are still more neighbors. For example, in the figure, the stomatognathic system has no neighbors, because there are no visible neighbors and the border is solid. In addition, the edge counts (visible/total) inform the user if all the edges are displayed.

Shown at the bottom of the figure, the fourth operation changes the root of the tree. Whereas other visualizations may show the tree of just parts descending from the root, the ontology viewer also shows the inverse (part of) branching from the root as another tree. These two trees are disjoint aside from the root, that is, there cannot be any edges between the regular tree and the inverse tree. Choosing a new root for the tree can be used both for seeing the relationship and inverse relationship of the new root and for hiding cousin nodes that are irrelevant. If the user does not need the context of the inverse tree, it can be disabled via a checkbox in the menu. For convenience, setting the root can also be performed via a left double-click.

Once the user has expanded several nodes and decides there are too many visible on the screen, the user can collapse a subtree of nodes. The operation is available on the right-click menu; additionally a single left click performs an expand or collapse that does not load new data from the back end. Figure 4 shows a collapse operation. The top figure shows the view before the collapse, and in the bottom figure, the larynx is collapsed. The blue outline around the larynx denotes that it has been collapsed, rather than that it has no neighbors. Both changing the root and collapsing nodes work well for hiding irrelevant nodes. Figure 5 shows a cluttered view of bone subclasses, whereas figure 6 has long bones set as root, and figure 7 has long bones and flat bones collapsed.

Figure 3: The top diagram shows a part hierarchy revealed from three consecutive expansions by the user. Then the user changed the root of the layout, creating the view in the bottom.

Figure 4: The top figure shows the layout before the collapse operation, and the bottom figure shows the layout afterward.

Figure 5: A view of the taxonomy of human bones. The view is cluttered, and labels are overlapping.

Figure 6: The user can set the root to just “Long bone” to study that information with less clutter. The root can be set back to “Bone” later.

Figure 7: Another way to reduce clutter is to collapse other subtrees. Here the short and irregular bones can be easily seen because the long bones and flat bones have been collapsed.

Another navigational tool is to switch the relation being shown, making the clicked node the new root. As mentioned for the VisualLayer, relevant edges are loaded from the back end, but no new nodes are loaded. This context is extremely useful and can be considered an approximation of degree-of-interest. For example, suppose the user loads some nodes in the part relationship as in figure 8 but then wants to change the viewed relationship to subclass. The browser produces the subclass view seen in figure 9. If all siblings of the visible nodes were displayed, the resulting view would be largely irrelevant; in figure 10, only the yellow nodes are the relevant ones (the nodes visible in figure 9). When a user changes back to a previously visited relation, the state (e.g. which nodes were expanded/collapsed) is remembered.

Figure 8: View with several parts expanded.

4.2.1 Secondary relationships

Sometimes it can be useful to see more than one relationship at a time to understand a more complex situation. Other visualizations display more than one relation at a time by default and rely on the user to filter the relations. The approach taken here is different, in that by default, just one relation (and perhaps its inverse) is shown, and the user can opt to show one or two secondary relationships. The secondary relationships are only shown one level deep beyond the primary hierarchy, so they deliver context without clutter. Figure 11 and figure 12 show two examples of secondary relationships. The first primarily shows the branching of the coronary arteries and secondarily shows the regions they supply. The second primarily shows the partitions of the heart and secondarily shows the arteries that supply them. Having more than one secondary relationship is useful for lymphatic chains, where one might be concerned about lymphatic drainage, afference, and tributaries.

Figure 9: View switched from part to subclass. The nodes shown are from the context of the previous views.

Figure 10: If the subclass hierarchy were expanded such that all the nodes in the original part view were visible, it would look like this. The nodes displayed in the relevant view (previous figure) are highlighted in yellow.

Figure 11: Branches of coronary arteries as the primary relationship and arterial supply of as the secondary relationship.

4.2.2 Other non-modifying interaction

Several other features are available that have not yet been discussed. If there are too many nodes on the screen, the user can artificially restrict the maximum depth to a small number.

Figure 12: Parts of the heart as the primary relationship and arterial supply as the secondary relationship.


The browser supports differentiating between subrelations for a given relation, e.g. regional part and constitutional part, which are types of part. Subtypes of edges are color-coded differently, and in addition, the user can elect to load just one subtype or restrict the expansion of neighbors to just one subtype. Because the primary relation is the superrelation, a node with all regional parts shown may still have a dashed border, because other types of parts are not being shown.

The user can view the details of a node, which shows a text document of the entity’s attributes and relations, all hyperlinked. A work in progress is to toggle between the document showing the original ontology and the current transformations, as well as allowing editing through the document. Having several parallel ways of accomplishing the same task is a blessing to the users, so they can choose whichever they are more familiar or comfortable with, or they can use the appropriate tool for their own use, of which I am not yet aware. A potential future item is the slick zoomable interface in Jambalaya, where instead of the document appearing in a new window, it appears by zooming into the node.

Finally, nodes can be tagged or untagged with one of ten different colors, bright colors roughly spanning the hues of a rainbow. The color choice was designed so that no two colors are likely to be confused with each other or with the otherwise used colors (pale green and pale red). The purpose of tagging is to provide a visual reminder for the user or to assign some additional meaning to a set of nodes, with the cue being the color. Tagging is additionally supported in the search and query interfaces, to be described in later sections. A later section will also discuss advanced applications of tagging for a specific use case. A work in progress is to provide a means for the user to annotate what the tags mean, for their own reference and for others who may use the workspace later.

A fundamental design decision was that nodes can be tagged multiple colors, which provides much more power to the user, as the user can then operate on intersections, unions, or subtractions on sets of tag colors. The difficulty though is how to display the multiple colors assigned to a given node. Three ideas include having multi-tagged nodes slowly animate their colors in order, having multi-tagged nodes display colored stripes showing all their different tags, and having a button that cycles through the tagged colors (in essence, animation on demand). The first and last, which were the easiest to implement, were tried, and the last was chosen because the first was too distracting. One problem however is how to notify the user if there are indeed multi-tagged nodes on the screen. The first attempt was to have a button that is by default disabled and is only enabled if there are such nodes present. However it was too subtle. The strategy now is to notify the user via a dialog if the current view has multiple tags and the previous one does not, and then the user can press the same button to cycle between tags. This tactic should provide the needed reminder without being excessively annoying.
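The set operations that multi-color tagging enables can be shown with ordinary sets. The node names, colors, and the `tagged` helper below are hypothetical and only illustrate the idea.

```python
# Each node carries a set of tag colors (possibly empty).
tags = {
    "Heart": {"red", "blue"},
    "Left atrium": {"red"},
    "Aorta": {"blue"},
    "Larynx": set(),
}


def tagged(color):
    """Return the set of nodes carrying the given tag color."""
    return {node for node, colors in tags.items() if color in colors}


both = tagged("red") & tagged("blue")      # intersection
either = tagged("red") | tagged("blue")    # union
red_only = tagged("red") - tagged("blue")  # subtraction
```

If each node could hold only one color, the intersection would always be empty and the subtraction would be trivial, which is why multi-tagging gives the user more expressive power.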

4.3 Modification

The user can perform a suite of modifications accessed through the “modify” submenu of the right-click popup menu. To differentiate modifications from the original ontology, new nodes (entities) are shaded pale green, deleted nodes are shaded pale red, new edges are colored green, and deleted edges are colored red. A user may not want to see deleted nodes (or nodes connected by deleted edges), and if there are many of them, they can clutter the view. A checkbox toggles the deletions to be hidden, which cleans the view. However, it is desirable to know whether an item is not visible because an edge to it does not exist or because an edge to it was deleted. A red border on a node (the color of the border is orthogonal to its line style) indicates that it has neighbors or incident edges that have been deleted and hidden. Figure 13 illustrates all of these cases, except the case of a new node.

Figure 13: This view of parts of the heart shows deleted nodes, deleted edges, and new edges, as well as what the view looks like when deletions are hidden.

A user adds or deletes an edge by first initiating the add or delete by selecting it from the modify submenu from right-clicking on the edge’s source node. As the user drags the mouse to the target node, a red or green arrow is drawn from the source to the mouse cursor. Clicking on the target node confirms the operation. The original version had no feedback other than the new edge appearing, but after watching users add edges the wrong way, the current version has a dialog that frames the meaning of the new edge as a sentence, e.g. “heart has part left atrium” and asks the user for confirmation. A difficulty is that the slots in an ontology have names such as “part” and “part of” and to a novice user, “A part B” does not have a clear meaning. The heuristic used is that a slot name that ends in a preposition (e.g. is, of, with, by) or appears to be a verb should be prefixed with “is”, and the other slot names should be prefixed with “has.”
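The sentence-framing heuristic can be approximated as follows. This sketch covers only the trailing-word check; detecting whether a slot name "appears to be a verb" is left out. The word list is taken from the examples in the text, and the function name `frame_edge` is hypothetical.

```python
# Trailing words taken from the examples above; a real list would be longer.
TRAILING_WORDS = {"is", "of", "with", "by"}


def frame_edge(source, slot, target):
    """Phrase an edge as a sentence for the confirmation dialog."""
    last_word = slot.split()[-1].lower()
    prefix = "is" if last_word in TRAILING_WORDS else "has"
    return f"{source} {prefix} {slot} {target}"


forward = frame_edge("heart", "part", "left atrium")
inverse = frame_edge("left atrium", "part of", "heart")
```

Here `forward` reads "heart has part left atrium" and `inverse` reads "left atrium is part of heart", so an edge drawn in the wrong direction produces a sentence that the user can recognize as wrong before confirming.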

A practical use of deleting edges is to clean up an unnecessarily dense partition that can be easily resolved by transitivity. Figure 14 shows the clutter resulting from the superfluous edges. This redundancy would be difficult to discover in a non-graphical view. The user can select the redundant edges to delete (figure 15) and then when the “hide deletions” checkbox is enabled, the graph looks much cleaner (figure 16). Again note the red border, which means that there are deletions that are not shown. When the workspace is saved into a view, the view will not contain any sign of those edges.
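In the browser the user picks the redundant edges by eye. To make precise what "redundant by transitivity" means, the following sketch flags every edge whose target is also reachable from its source by some other path. It assumes the relation is transitive and the graph is acyclic, and the function name and sample entities are hypothetical.

```python
def redundant_edges(edges):
    """Return the edges (a, c) for which another path from a to c exists."""
    children = {}
    for source, target in edges:
        children.setdefault(source, set()).add(target)

    def reachable(start, goal, skip_edge):
        # Depth-first search from start that is not allowed to use skip_edge.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for nxt in children.get(node, ()):
                if (node, nxt) == skip_edge or nxt in seen:
                    continue
                if nxt == goal:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {edge for edge in edges if reachable(edge[0], edge[1], edge)}


part_edges = {
    ("Heart", "Left atrium"),
    ("Left atrium", "Wall of left atrium"),
    ("Heart", "Wall of left atrium"),
}
superfluous = redundant_edges(part_edges)
```

In this example only the direct edge from "Heart" to "Wall of left atrium" is flagged, because it can be inferred from the other two.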

Deleting and excepting deletions on entities (and their subtrees) follows the rules described in the previous chapter. So that changing a parent is an atomic operation, rather than an edge addition and an edge deletion, the standard edge additions and deletions are not available in the sub/superclass views, and the change-parents operation is only available in those views.

When adding a new entity, the user needs to specify the parent of the new node. The user does this by initiating the new-entity operation from the to-be-parent node in the sub/superclass views, or from a to-be-sibling node in any other view. In one of these other views, the new node will not be visible because it has no other relationships, so to establish the cognitive connection with the new node, the view is changed so that the primary relation is subclass. The node inherits all the slots of the parent, but not necessarily its values. It only inherits the values of slots from the parent if the grandparent shares those same values. This heuristic infers which properties are inherited (e.g. if the entity has a mass) or are overridden (e.g. the entity’s name). When the user wants to add relations to the new node, the node will not be visible in views of relations other than


Figure 14: The ontology contains a redundant partonomy that could be inferred via transitivity. A user might want to cull the redundant edges.

Figure 15: One by one, the user deleted the redundant edges, which appear red.

Figure 16: Finally, the redundant edges can be hidden from view, with a red border around a node reminding the user that it has outgoing edges that were deleted.


Figure 17: This view was created by deleting the root, undeleting “Body of vertebrate,” and adding some new nodes (entities).

sub/superclass, and hence, the user cannot click on both nodes to add an edge between them. Instead, a feature called “connect to unseen” exists that lets the user connect the node to one of a set of recently touched nodes.
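The value-inheritance heuristic for a new entity, described above, can be illustrated as follows. This is a minimal sketch under assumptions: entities are modelled as plain dictionaries from slot name to value, an unset value is represented by None, and the function name is hypothetical.

```python
from typing import Any, Dict, Optional


def initial_slot_values(parent: Dict[str, Any],
                        grandparent: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Build the slots of a new entity created under `parent`.

    Every slot of the parent is inherited. A slot's value is inherited only
    when the grandparent carries the same value; otherwise the slot is left
    unset (None) because it was presumably overridden at the parent.
    """
    slots: Dict[str, Any] = {}
    for slot, value in parent.items():
        shared = (grandparent is not None
                  and slot in grandparent
                  and grandparent[slot] == value)
        slots[slot] = value if shared else None
    return slots


# Invented example: "has mass" is shared across two generations and is
# inherited; the name differs between parent and grandparent and is not.
grandparent = {"name": "Lung", "has mass": True}
parent = {"name": "Left lung", "has mass": True}
print(initial_slot_values(parent, grandparent))
# {'name': None, 'has mass': True}
```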

The top part of figure 17 shows several new nodes, as well as deleted subtrees. When the transformations are saved as a view, the appearance of the view is as if it were an untouched ontology of its own, as is seen in the bottom of the figure. With just a few modifications, the view of an ontology can be changed dramatically.

4.4 Query

The query interface provides a means of making powerful queries, though it is not intended for the novice. Computations can be with unary or binary operators (one or two arguments); the results are returned to an output list. The output list can be named and stored or piped back to input. The various inputs to the computations include all the loaded nodes, nodes that are visible, nodes that have been touched, nodes that have been recently touched, the loaded nodes that are deleted, nodes that are tagged a certain color, the output list, and stored lists. In addition, the results list as a batch can modify the visualization. The interface can be used for several general tasks: performing operations that would be too mundane or time-consuming to do manually, such as tagging all visible nodes, or building complex queries, such as finding entities that have no subclasses and are not part of anything. The computations include direct and transitive relations, predicates, and set arithmetic. Actions include storing a list, loading, tagging, or deleting items, and other commands that integrate with the visualization. Figure 18 displays two screenshots of the query interface. The first two columns show the inputs to the operations; only one is used for a unary operation. Some of the populations or actions require an additional argument, for example what color to tag the selected nodes, or the name of a saved list. An example of a complex query is to find terms that are transitively part of both the male and female bodies but not transitively part of the human body. The reason for such a query is that perhaps these terms should be transitively part of the human body instead. Being able to experiment with such queries through an interface rather than programmatically lets one easily discover and correct inconsistencies that are innate to any huge ontology.
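To make the complex query above concrete, the following Python sketch composes a transitive-relation computation with set arithmetic. It is not the browser's query engine: the function name is hypothetical and the has-part table is invented toy data, not content of the FMA.

```python
def transitive_parts(root, has_part):
    """Return every node reachable from `root` through has-part edges."""
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        for part in has_part.get(node, ()):
            if part not in found:
                found.add(part)
                stack.append(part)
    return found


# Invented toy data: "term x" hangs off both sexed bodies but was never
# attached to the human body.
has_part = {
    "male body": {"heart", "term x"},
    "female body": {"heart", "term x"},
    "human body": {"heart"},
    "heart": {"left atrium"},
}

in_both = (transitive_parts("male body", has_part)
           & transitive_parts("female body", has_part))
suspects = in_both - transitive_parts("human body", has_part)
print(suspects)  # {'term x'}
```

Each step corresponds to one operation in the interface: two transitive-relation computations, an intersection, and a difference, with the intermediate result piped back as input.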

Ideally a powerful query interface could be based on a visual flowchart or natural language, such as those investigated by Bernstein’s lab [26]. However, some of the use cases require powerful queries, and this implementation serves as a placeholder for both the users and the researchers studying the users or the queries generated. Also, there exists a tradeoff in choosing what set of computations to reveal to the user.


Figure 18: (top) a query that computes the superclasses of a set of entities; (bottom) a query that computes the intersection of two sets of entities

Certainly, if the programmer had unlimited time, any possible operation could be precoded, but then the challenge would be how to present it to the user without the user becoming overwhelmed with too many tools. At the same time, a small set of tools may be sufficient to perform any operation through enough compositions, but the logic required to figure out how to chain the computations together is likely too much for the average user. An example is finding nodes that are leaves (have no subclasses). One way to do it is to take the set of all nodes, compute the set of superclasses for this set, and subtract it from the original set. However, this logic is counterintuitive and is completely unnecessary with a “has >1 relateds” predicate. A corollary is that there might be multiple ways to express the same computation, but one such way is dramatically more efficient to compute. This issue should be the job of a compiler or optimizer that works behind the scenes, and users should be given the freedom to express the queries in whatever ways they like.
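The two formulations of the leaf query can be compared side by side. In this sketch the class table is invented, and the predicate version is an assumed reading of a "count of related nodes" predicate rather than the browser's actual operator.

```python
from collections import Counter

# Invented toy hierarchy: child -> set of direct superclasses.
superclass_of = {
    "organ": set(),
    "lung": {"organ"},
    "left lung": {"lung"},
    "right lung": {"lung"},
}
all_nodes = set(superclass_of)

# Formulation 1 (set arithmetic): everything that is somebody's superclass
# is not a leaf, so subtract those nodes from the full set.
superclasses = set().union(*superclass_of.values())
leaves_by_sets = all_nodes - superclasses

# Formulation 2 (predicate): count each node's subclasses and keep the
# nodes whose count is zero.
subclass_count = Counter(p for parents in superclass_of.values() for p in parents)
leaves_by_predicate = {n for n in all_nodes if subclass_count[n] == 0}

print(sorted(leaves_by_sets))  # ['left lung', 'right lung']
```

Both formulations give the same answer; the second states the intent directly, which is the argument for offering predicates in addition to a minimal set of composable tools.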

4.5 Search

Often a user wants to explore a specific entity. Rather than browse the subclass hierarchy, it is much more convenient for the user to type in the name of the entity directly and go to it. Furthermore, the user may not know the exact name of the entity, or despite the impressive completeness of the ontology, the name may not be present. The system supports three modes of search of increasing complexity, intended for different tasks. All three modes use Apache Lucene⁵

as a backend. The only modification is that hits with shorter names are given priority, because of the prevalence of compound names in anatomy. For example, a query on “lung” would additionally return, among many others, “left lung” and “lobe of lung.”
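How the priority for shorter names is implemented inside the Lucene-backed search is not described here. As a hedged stand-in, the effect can be approximated by re-ordering a list of hits after the search returns; the function name and the (name, score) hit format are assumptions of this sketch and are not part of the Lucene API.

```python
def prioritize_short_names(hits):
    """Order hits so that shorter entity names come first.

    hits: list of (name, score) pairs as returned by some text search.
    Ties in length are broken by descending score.
    """
    return sorted(hits, key=lambda hit: (len(hit[0]), -hit[1]))


hits = [("lobe of lung", 1.0), ("left lung", 1.0), ("lung", 1.0)]
print([name for name, _ in prioritize_short_names(hits)])
# ['lung', 'left lung', 'lobe of lung']
```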

The first search mode is a search textfield on the main browser window. It has no fancy features and is designed for quick access. The user is not given a choice of hits, which is fine because nothing bad can happen should the hit not be what the user intended. The textfield is augmented with a pulldown menu

5 http://lucene.apache.org


that has a list of recently visited nodes that the user can select for quick access. The second search mode is a full-featured search, where multiple hits can be investigated before committing. In an attempt to grant some of the powers of a Lucene query without forcing users to learn the language, the interface gives the options of searching for an exact phrase, searching for words that start with a prefix, and searching for words that sound like the entered word. The hits are culled to show only the most likely. In most cases, this culling keeps the signal-to-noise ratio high, but in a few instances, the desired entity could not be found. A checkbox turns off this culling.

The third search mode is intended for the power user wanting to do searches on a list of items. Items are categorized either as “found” or “not found.” A batch search attempts to match as many as possible. Each matched item is annotated with how it was matched (perfect match, matched a synonym, matched this phrase, etc.) so that the user can evaluate each match. Incorrect matches can be moved from “found” back to “not found.” Items that the batch search could not find can be matched manually by the full-featured search previously described. Selections of found entities can be loaded, deleted, excepted, tagged, etc., as can be done through the query actions. The entities also can be annotated so the user can note which had been processed or add any other commentary. The “found” list can be sorted in several ways: the original order of the input, alphabetical by the input terms, alphabetical by the matched entities, how the matches were made (e.g. exact, alternate name, manual, etc.), or alphabetical by the comments. A batch search can be saved and loaded as well.
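The bookkeeping of the batch search can be sketched as below. This is an illustration under assumptions, not the browser's implementation: matching is reduced to case-insensitive lookup of names and synonyms, and the `Match` record, the function names, and the sample vocabulary are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Match:
    term: str          # the input term
    entity: str        # the matched entity name
    how: str           # "exact", "alternate name", or "manual"
    comment: str = ""  # free-form user annotation


def batch_search(terms, names, synonyms):
    """Split `terms` into a found list (annotated) and a not-found list.

    names: iterable of entity names; synonyms: dict synonym -> entity name.
    """
    by_name = {n.lower(): n for n in names}
    by_synonym = {s.lower(): e for s, e in synonyms.items()}
    found, not_found = [], []
    for term in terms:
        key = term.strip().lower()
        if key in by_name:
            found.append(Match(term, by_name[key], "exact"))
        elif key in by_synonym:
            found.append(Match(term, by_synonym[key], "alternate name"))
        else:
            not_found.append(term)
    return found, not_found


def reject(match, found, not_found):
    """Move a match the user deems incorrect back to the not-found list."""
    found.remove(match)
    not_found.append(match.term)


# Invented sample data.
found, not_found = batch_search(
    ["lung", "hepar", "gizzard"],
    names=["Lung", "Liver"],
    synonyms={"Hepar": "Liver"},
)
by_how = sorted(found, key=lambda m: m.how)  # one of the sort orders
```

Keeping the "how" annotation on each record is what lets the user sort the found list by match type and review the least certain matches first.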

Figure 19 shows a screenshot of the Batch Search in use on the RadLex use case, described in the next section. The left column shows terms that have not been matched to FMA entities, and the right column shows terms that are matched. The terms in the left column highlighted yellow are ones where the search found a match that the user deemed incorrect, and the user moved the terms from the right column back to the left column. The majority of the terms in the right column were exact matches. The terms highlighted green are examples of terms that did not have

Figure 19: The batch search in use. The items highlighted yellow were moved from “found” to “not found” by the user. The other highlighted items illustrate different types of matches.

an exact match; both the search term and matched term are shown. The terms highlighted red are examples where there was a match with a synonym or other alternate name. The terms highlighted blue are examples of terms that the user manually searched for. The manual query is also stored in the comment. Finally, the user added comments to the nodes denoting that they had been processed.

5 RadLex Use Case

The Radiological Lexicon (RadLex) came into being while the FMA was still being created. As a result, RadLex is not compatible with the FMA, though compatibility would be desirable. During the testing of the ontology browser, a set of steps was developed to align the lexicon with the FMA. Potentially, the user interface could be tailored to specifically address this use case, but a more generic interface has a wider use and may


allow people to discover new ways of manipulating data that developers of an interface could never have envisioned.

One useful, but nonintuitive, technique is to start by deleting the whole FMA and then excepting only the nodes needed (e.g. for RadLex). Then the user can start trying to match terms, as was shown in figure 19. The user does not need to match all possible terms; he or she can go back later and match more, using the comments to remember what has already been done. A tactic that works well is to sort the found list by “how found,” undelete, tag, and mark the obvious ones, move the obviously wrong ones back to not found, and defer the rest of the terms (fewer than a quarter) until later.

Then the user can view the tagged nodes to see how well connected their partonomies are. If there is a bridging node that needs to be undeleted but is not part of the lexicon, that node can be tagged another color. If the user is happy with the connectivity of a set of nodes, they all can be tagged a third color (quickly in the query interface). Once all the terms have been matched, and all the matched terms have been tagged this third color, the alignment can be considered complete, unless the user has additional requirements.

Figure 20 shows a set of matched terms that are well connected in the part hierarchy. In this case, orange is used to denote that the user is happy with them and does not need to deal with them further. Figure 21 is a view where the user still needs to resolve the part relationships for the lime terms. Once the user fixes these issues and is satisfied, he or she will tag them orange. Finally, looking at the subclass view in figure 22 gives an overview of what has been done and what unresolved problems still remain.

5.1 Tutorial

The ontology browser supports an interactive step-by-step tutorial to help users learn how to use the system. As the programmer and most experienced user of my own application, I often forget that it has some learning curve. I designed the controls to be simple, to the point where it feels almost as easy to me as playing a videogame, but it takes users time to learn

Figure 20: A large set of RadLex terms are connected via part relationships. The user highlighted these orange as a reminder that they are content with them.

Figure 21: Here some terms are connected, but some are not. The lime terms need to be connected by entities that are not part of RadLex.


Figure 22: This view gives an overview of how many RadLex terms that are organs still need to be arranged in part relationships. The user can then click on one of the lime terms, change the view to part, and work on it, tagging it orange when happy.

an application or even a videogame. The program has a rough help document, but even with revisions or perhaps even a document that painstakingly details a walkthrough, users might still get frustrated. One tactic employed in many games is to start off with a live walkthrough/tutorial that focuses on only limited features and expands the capabilities once the user has mastered the basics. The educational software Alice⁶, which has a visual programming environment for creating stories/animations, has some excellent interactive tutorials. As an experienced user (the tutorial author) builds a workspace, he or she can also insert comments. The workspace can then later be played back in a special mode (the tutorial mode) where the comments appear in a tutorial window one at a time and only advance if the user is able to duplicate the next logged action. When a user performs an incorrect action, the system can check if it was the correct type of action or if the arguments for the action (e.g. clicked on the right node) were correct, and give the user feedback. Additional context-dependent help could be built into each type

6 http://www.alice.org

of logged action. There are currently three tutorials for the system. The first focuses on navigation, the second on adding and deleting nodes, and the third on adding and deleting edges.
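The playback rule of the tutorial mode (advance only when the user duplicates the next logged action, and otherwise say whether the action type or its arguments were wrong) can be sketched as follows. The `Action` and `Tutorial` classes, the action names, and the feedback strings are hypothetical; the browser's log format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str    # e.g. "expand", "delete node", "add edge"
    args: tuple  # e.g. the node(s) the action was applied to


class Tutorial:
    def __init__(self, steps):
        # steps: list of (author comment, logged Action) pairs
        self.steps = steps
        self.position = 0

    def current_comment(self):
        """Comment shown in the tutorial window, or None when finished."""
        if self.position < len(self.steps):
            return self.steps[self.position][0]
        return None

    def attempt(self, action):
        """Compare a user action with the next logged one and give feedback."""
        if self.position >= len(self.steps):
            return "Tutorial complete."
        expected = self.steps[self.position][1]
        if action == expected:
            self.position += 1
            return "Correct."
        if action.kind != expected.kind:
            return f"Wrong type of action: expected '{expected.kind}'."
        return f"Right action, wrong target: expected {expected.args}."


tutorial = Tutorial([
    ("Expand the heart to see its parts.", Action("expand", ("Heart",))),
    ("Now delete the left atrium.", Action("delete node", ("Left atrium",))),
])
print(tutorial.attempt(Action("delete node", ("Heart",))))  # wrong type
print(tutorial.attempt(Action("expand", ("Lung",))))        # wrong target
print(tutorial.attempt(Action("expand", ("Heart",))))       # Correct.
```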

6 Evaluation

The ontology browser is undergoing iterations of user evaluations and refinement. Many ideas, such as the need for a tutorial, having a single “expand” operator that abstracts away loading from a back end, and displaying the counts of branches, originated from user feedback. Early feedback consisted of much frustration of “not knowing what to do” and being overwhelmed with all the features and the subtle visual cues; each of these has been addressed, and subsequent evaluations have shown marked improvement. The evaluations have been productive in identifying annoyances, points of confusion, and other usability problems. Feedback so far is positive regarding use as a browser for the FMA. Users felt they learned about the FMA with the tool, and they would use the tool again. They spoke highly of the search and history capabilities.

Fourteen people who tried the browser unsupervised filled out an online survey. The full survey is reproduced in Appendix A. The first part of the survey assessed their familiarity with ontologies, anatomy, computer applications, database queries, and the FMA. The vast majority rated themselves as having above-average familiarity with computer applications and queries and average to below-average familiarity with anatomy. The one user who was an expert in anatomy had only a layperson’s knowledge of anatomy. Another user was very computer savvy, yet had almost no anatomical knowledge. Another user was savvy in both anatomy and computers. Given that most of the audience were people associated with bioinformatics, most were at least acquainted with ontologies and the FMA. Given the size of the sample and the skew of the users, no reliable correlations can be drawn between their prior knowledge and their evaluation of the system. A future study should target either people who are more allied with biology or people who are downloading ontologies from the


National Center for Biomedical Ontology.

A series of questions assessed the users’ effectiveness and satisfaction with the tool on a discrete scale of 1 to 5 from strongly disagree to strongly agree. Table 1 shows the results, which were generally positive aside from performing complex queries. Half the users spent less than half an hour with the system; one user logged more than two hours. There do not appear to be any significant correlations between the time spent using the system and the subsequent evaluation ratings.

One question asked the user to state the most interesting thing learned about the FMA from the software. This question was designed to assess insight and discovery from the undirected exploratory process of browsing. Unfortunately, users did not reply with exact examples, either because they did not read the survey until afterward and did not remember the exact examples, or because the question was too vague. Nevertheless, many users noted that the browser enabled them to fathom the complexity of the FMA, especially when looking several levels deep. This complexity cannot be seen in an indented list or a Protege frame. One user discovered that subclass/superclass and instance/type seemed to be mostly duplicates and was confused. This occurrence further motivates the need for a view. Non-expert users are probably not aware of the difference that Protege makes between a class and instance; a “cleaned” view of the FMA could have instance and type relations removed.

A series of questions inquired about the likes and dislikes for the primary visualization, the search features, and the query features. Overall, users liked the accessibility of commands in the right-click menu, the navigation, and the layout. Some users said that they would have liked to have a certain feature, but that feature was in fact present and additionally documented in the tutorial. Examples included the ability to restrict the depth of the layout and having a frame-like list of all the relationships for an entity. It is not known if the users in fact actively and wholeheartedly participated in those tutorials. Other concerns included the desires for features such as being able to move nodes manually, better handling of overlaps of nodes, less finicky performance of tool tips from hovering over edges, and disambiguating the

subrelationships with a legend and a more specific mouse-over. For the search, a few complained that one had to hit “go” to search, rather than just press enter in the search box, which is a habituation issue. Others complained that the quick search did not give them multiple search results or did not correct misspellings. The search window gives multiple search results and optionally can search for misspellings. Perhaps the tutorials need to discuss the search window, or the quick search should incorporate features of the search window. No one experimented much with the query window; clearly the user interface needs to be made more novice-friendly, which would be a significant undertaking. As some users suggested, a tutorial for the query window would be a good start.

Finally, users listed other suggestions or desired features for the ontology browser. They wanted to be able to select several nodes at once, such as with a lasso or by control-clicking, and perform an operation on the selected set, for example deleting. In certain cases, some actions may not produce easily noticeable visual changes; a cue such as a beep may help. One user suggested that if one were to search for a node that is already visible in the layout, it would be highlighted rather than become the new root. Users suggested a legend to explain the colors and styles of the borders and edges, and perhaps the borders and edges being thicker and their colors more distinct.

The evaluations just described are more qualitative in nature than quantitative. I have plans to conduct a comparison of my system with other ontology visualization tools, such as Jambalaya[43] and TGVizTab[4]. The comparison would consist of obtaining timing results on tasks and measuring effectiveness, efficiency, and satisfaction via a survey, e.g. the System Usability Scale (SUS). Given that the previous studies have been somewhat biased toward people with a stronger computer background than a biology background, this study should focus on a population with more of a biology background. Unfortunately, I have not been able to find another graphical tool, including the two tools just mentioned, that is capable of practically visualizing the FMA, because it is so large. Instead I plan on using a small meaningful subset of the FMA, namely a knowledge base of the RadLex abdominal terms, for running the user experiments.


Survey question (1: strongly disagree – 5: strongly agree)              mean  stddev
From using this software, my understanding of the FMA has improved.     3.86  0.363
I would use this software in the future to explore the FMA.             4.00  0.784
Using this software, I could effectively search for terms in the FMA.   4.21  0.802
Using this software, I could do simple queries on the FMA.              3.79  0.802
Using this software, I could do complex queries on the FMA.             3.00  0.877
The help documents were useful.                                         4.50  0.519
I did not get confused much.                                            3.29  0.994
I feel like I learned how to use this software.                         3.86  0.535

Table 1: Users’ evaluations of the ontology browser

7 Conclusions and Future Work

The architecture for constructing views is powerful and flexible and should be easily extensible for incorporating queries. The vision is that in the future, the architecture would be like figure 23, with the database views chained together transforming queries, but still fitting smoothly into the existing visualization framework. Straightforward extensions of the DataLayer would include incorporating further constraints (facets) and better support for instances. Additionally, it may be desirable to materialize the logs (or query transformations) plus the source ontology into a new ontology.

An extension of the DataLayer allows it to masquerade as a triple store and serve as the back end for Jena⁷, a middleware for Semantic Web applications. A SPARQL⁸ query through a webservice is decomposed by Jena into individual requests for sets of triples. One challenge is that a request may ask for many triples (e.g. triples with any subject and any object but a specific predicate), and Jena would

7 http://jena.sourceforge.net
8 http://www.w3.org/TR/rdf-sparql-query/

internally filter the triples. This inconvenience stems from the fact that SPARQL’s expressiveness is richer than that of SQL, so a more specific query may not be capable of being sent to the back end. SPARQL does allow the user to request a query to be passed on by Jena, but this feature removes the notion of an abstract source and requires the user to possess expert-level knowledge on how the view was constructed. The best solution may be for the chaining of views to be performed entirely in SQL, so that queries may be better transformed and passed on, rather than costly extra data being transferred across each link of a potentially complex chain of views.
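The cost of broad triple requests can be illustrated with a toy triple source. The sketch below is plain Python and does not use the Jena API; the `TripleSource` class, its `find` method, and the sample triples are assumptions made for illustration. A `None` argument acts as a wildcard, and a counter records how many triples cross the interface.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]


class TripleSource:
    """A toy back end that answers (subject, predicate, object) patterns."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self._triples = list(triples)
        self.transferred = 0  # number of triples handed to the caller

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        # None means "any value" for that position of the pattern.
        for triple in self._triples:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                self.transferred += 1
                yield triple


source = TripleSource([
    ("heart", "part", "left atrium"),
    ("heart", "part", "right atrium"),
    ("lung", "part", "lobe of lung"),
])

# Broad request: only the predicate is fixed, and the caller filters afterward.
broad = [t for t in source.find(p="part") if t[0] == "heart"]
broad_cost = source.transferred  # 3 triples transferred, 2 kept

# Specific request: the subject constraint is pushed down to the source.
source.transferred = 0
narrow = list(source.find(s="heart", p="part"))
narrow_cost = source.transferred  # 2 triples transferred, 2 kept
```

Both requests yield the same triples, but the broad one moves every "part" triple across the interface first. With a chain of views, that overhead would be paid at each link.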

An excellent direction for future work would be to try to consolidate the modifications by removing redundant operations, refactoring operations, or deducing equivalent queries (presuming the source view does not change); in essence, this would take advantage of the Kolmogorov complexity of a set of modifications. A simple transformation involving deletions and exceptions could involve recognizing when all the children of an entity are deleted (or respectively excepted) and instead marking equivalently that the entity itself is deleted or excepted. Other transformations could be operating on the result of a query instead of operating on an explicit set. Such an optimization is really


Figure 23: Database views chain together via query transformations. On the frontend, a DataLayer queries the database view and provides information for the visualization.

only appropriate if the queries are on the previous view or the current view is frozen (saved), as any changes to the view could affect the inferred queries. Very likely, any work on optimization would be necessary as a part of translating the modifications into queries.
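The simple consolidation mentioned above (replacing the deletion of all of an entity's children with a single deletion of the entity) might look like the following sketch. It assumes a single-parent hierarchy, no exceptions in the log, and that marking the parent is acceptable as an equivalent edit, as the text suggests; the function name and data layout are hypothetical.

```python
def consolidate_deletions(deleted, children):
    """Collapse deletions upward wherever every child of a node is deleted.

    deleted: iterable of deleted entity names.
    children: dict mapping an entity to the set of its direct children.
    Returns a smaller set of deletions intended to describe the same view.
    """
    deleted = set(deleted)
    changed = True
    while changed:  # repeat so that collapses can cascade toward the root
        changed = False
        for parent, kids in children.items():
            if kids and parent not in deleted and kids <= deleted:
                deleted -= kids
                deleted.add(parent)
                changed = True
    return deleted


# Invented hierarchy: A has children B and C; B has children D and E.
children = {"A": {"B", "C"}, "B": {"D", "E"}}
print(consolidate_deletions({"D", "E", "C"}, children))  # {'A'}
print(sorted(consolidate_deletions({"D", "C"}, children)))  # ['C', 'D']
```

As noted above, such a rewrite is only safe while the underlying view is frozen: a child added to B later would change what the consolidated log means.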

Originally the design was expected to include more semantic inference. The inference was much more intricate than expected, and the original ontology was not consistent over the additional logic. The only inferences made were with relations having inverses and inheritance with subclasses. Much work could be done with defining inferences (and their rules with respect to editing) and experimenting with their usefulness. One area that was specifically ignored is multiple inheritance. What happens to an entity when one parent is deleted but not the other? Perhaps one way to solve the problem is by using Java’s approach: a class can have just one parent but can implement many interfaces. In this respect, an entity has a single superclass but perhaps multiple templates, which can be deleted. Removing a template removes its respective slots. Additional rules would need to be defined to avoid naming conflicts between slots in multiple templates.

The ontology browser enables a user to navigate and edit (create a view of) an ontology quickly and easily. Furthermore, the compact representation of the FMA that can be used by the browser allows both the browser and the ontology to be delivered as a small payload, letting users experiment with minimal startup expense. To make the system ready to distribute to anyone, more work still needs to be done for usability, especially with constructing tutorials and walkthroughs to get a user up to speed. Additionally, the system could prompt the user with hints as the user is exploring. Also, the editing features could be stripped down so that the system is simpler to use (as a pure browser). On a different note, the browser could be embedded into Protege, though for best results, it should have the option of spawning a new window, rather than remaining embedded in a pane or tab.

As more feedback is obtained from users, more advanced edits will be made available that are compositions of more fundamental edits. For example, “bypass me” for subclasses could consist of redirecting all the node’s children to have their parents be the node’s parents. The challenge would be to work these features into the user interface without making the menus too complex or the operations (click, drag, etc.) too overloaded. Right now dragging is reserved for panning and zooming. Other software takes advantage of a third mouse button (not present on every computer, especially laptops) or use of the control keys. Having to remember when to press shift, alt, or control is hard for the user to learn, so gestural mouse strokes may be better. Another option is to have different modes of interaction that the user toggles among.
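As an illustration of such a composite edit, the “bypass me” operation can be written as a composition of change-parent steps. This is a sketch under assumptions: the hierarchy is stored as a dictionary from each entity to its set of direct superclasses, the function name is hypothetical, and the bypassed node is left in place so that a separate deletion can follow if desired.

```python
def bypass(node, parents):
    """Redirect every child of `node` to `node`'s own parents.

    parents: dict mapping each entity to the set of its direct superclasses.
    Returns a new dict; the input is not modified.
    """
    grandparents = parents.get(node, set())
    result = {}
    for child, supers in parents.items():
        if child != node and node in supers:
            # Replace the link to `node` with links to its parents.
            result[child] = (supers - {node}) | grandparents
        else:
            result[child] = set(supers)
    return result


# Invented hierarchy: organ <- lung <- {left lung, right lung}.
parents = {
    "organ": set(),
    "lung": {"organ"},
    "left lung": {"lung"},
    "right lung": {"lung"},
}
after = bypass("lung", parents)
print(after["left lung"])  # {'organ'}
```

Performing the redirection for all children in one step keeps the edit atomic, in the same spirit as the change-parents operation.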

Many ontologies have a variety of constraints, including on the domain and range for relations, on the values of attributes, and on the cardinality of the values. It would be a challenge to have a simple editing interface that does justice to these constraints. Some changes can only be done by temporarily violating the constraint so that it can be satisfied upon the next step. An example of such a change is changing a parent of a subclass, where the system forces the change to be performed atomically. As more constraints get enforced (e.g. cardinality constraints, of


which single-inheritance is an example), defining the set of operations that need to atomically compose more primitive operations would get harder and harder. When users attempt to perform a change, they should be warned that an action may violate the constraints, and if so, why. Ideally the interface would give cues to what operations would be legal so that it is not trial-and-error. For that matter, some users may specifically want to ignore the constraints. The danger is that once constraints are ignored, it would be difficult to restore the system to a constrained state again if so desired, without undoing the intervening changes. A system such as Prompt could help resolve the conflicts.

The ontology browser uses a radial tree layout, which tends to perform well for ontology connectivity, even if there is the occasional cycle. There are certainly other layouts, such as force-directed layouts or treemaps. A fundamental challenge for any layout is label placement, since labels are so crucial for the understanding of an ontology. Distortion, such as a fish-eye lens or hyperbolic geometry, may help in dealing with many nodes on the screen. Similarly, a large ontology displayed on a virtual 10Kx10K screen would benefit from a picture-in-picture showing an overview with the current view framed. Some dense cliques are present in the FMA, e.g. myocardial zone connectivity; specialized (and likely costly) layouts are needed to view these optimally.

All types of layout will suffer from the situation when a node has thousands of children, as is the case with “tendon” or “ligament.” For this situation, it is best not to render each of the children as separate nodes, but rather as an “imposter” representing all of them (or at least all of them that have no other context) as a sector of a ring. The imposter could have a special interface to scroll through the entities, perhaps with an embedded fisheye.

The FMA has several types of instances. Some are trivial, such as synonyms that are really just the name plus an author and date. However, the reified relations (e.g. attributed part, attributed continuous with) map to attributes that have specific information qualifying the relationships. These instances may be best viewed as text, though it may be possible to encode some of the attributes of the instances as

colors or some other schema. Additionally, some of the instances have directions/orientations associated with them, and these could be used geographically for the layout.

References

[1] James Agutter, Noah Syroid, Frank Drews, Dwayne Westenskow, Julio Bermudez, and David Strayer. Graphic data display for cardiovascular system: Case study. In IEEE Symposium on Information Visualization, 2001.

[2] C. Ahlberg and B. Schneiderman. Visual in-formation seeking: Tight coupling of dynamicquery filters with starfield displays. In Proceed-ings of CHI, 1994.

[3] David Akers, Anthony Sherbondy, RachelMackenzie, Robert Dougherty, and Brian Wan-dell. Exploration of the brain’s white matterpathways with dynamic queries. In IEEE Visu-alization, 2004.

[4] Harith Alani. TGVizTab: An ontology visuali-sation extension for Protege. In Proceedings ofKnowledge Capture, Workshop on VisualizationInformation in Knowledge Engineering, 2003.

[5] Robert Albert, Noah Syroid, Yinqi Zhang, JimAgutter, Frank Drews, Dave Strayer, GeorgeHutchinson, and Dwayne Westenskow. Psy-chophysical scaling of a cardiovascular informa-tion display. In IEEE Visualization, 2003.

[6] M. Antoniotti, I. T. Lau, and B. Mishra. Natu-rally speaking: A system biology tool with natu-ral language based interfaces. In Biological Lan-guage Conference, 2004.

[7] Chandrajit Bajaj, Peter Djeu, Vinay Siddavana-halli, and Anthony Thane. TexMol: Interac-tive visual exploration of large flexible multi-component molecular complexes. In IEEE Vi-sualization, 2004.

[8] C. A. H. Baker, M. S. T. Carpendale, P. Prusinkiewicz, and M. G. Surette. GeneVis: Visualization tools for genetic regulatory network dynamics. In IEEE Visualization, 2002.

[9] Giuseppe Di Battista, Peter Eades, Roberto Tamassia, and Ioannis Tollis. Annotated bibliography on graph drawing algorithms. Computational Geometry: Theory and Applications, 4:235–282, 1994.

[10] Abraham Bernstein and Esther Kaufmann. GINO - a guided input natural language ontology editor. In 5th International Semantic Web Conference (ISWC 2006), pages 144–157. Springer, November 2006.

[11] Stuart K. Card, Jock D. Mackinlay, and Ben Shneiderman. Readings in Information Visualization: Using Vision To Think. Morgan Kaufmann Publishers, Inc., 1999.

[12] Jean-Louis Coatrieux and James Bassingthwaighte. Scanning the issue: Special issue on the physiome and beyond. Proceedings of the IEEE, 94(4), 2006.

[13] Daniel L. Cook, Jose L. V. Mejino, and Cornelius Rosse. Evolution of a foundational model of physiology: Symbolic representation for functional bioinformatics. In MEDINFO, 2004.

[14] Daniel L. Cook, Jesse C. Wiley, and John H. Gennari. Chalkboard: Ontology-based pathway modeling and qualitative inference. Preprint, 2007.

[15] Edmund J. Crampin, Matthew Halstead, Peter Hunter, Poul Nielsen, Denis Noble, Nicolas Smith, and Merryn Tawhai. Computational physiology and the physiome project. Experimental Physiology, 89(1):1–26, 2004.

[16] Tricia d’Entremont and Margaret-Anne Storey. Using a degree-of-interest model for adaptive visualizations in Protege. In Proceedings of the 9th International Protege Conference, 2006.

[17] Deborah Dowling. Experimenting on theories. Science in Context, 12(2):261–274, 1999.

[18] Danny Holten. Hierarchical edge bundles: Visualization of adjacency relations in hierarchical data. IEEE Transactions on Visualization and Computer Graphics, 12(5), 2006.

[19] Xiaodi Huang, Peter Eades, and Wei Lai. A framework of filtering, clustering and dynamic layout graphs for visualization. Conferences in Research and Practice in Information Technology, 38, 2005.

[20] Peter J. Hunter. Modeling human physiology: The IUPS/EMBS physiome project. Proceedings of the IEEE, 94(4), 2006.

[21] Chris Johnson. Top scientific visualization research problems. IEEE Computer Graphics and Applications, 24(4), 2004.

[22] Chris Johnson, Robert Moorhead, Tamara Munzner, Hanspeter Pfister, Penny Rheingans, and Terry S. Yoo. NIH/NSF Visualization Research Challenges Report. IEEE Press, 2006.

[23] Ira Kalet, Mark Whipple, Silvia Pessah, Jerry Barker, Mary Austin-Seymour, and Linda Shapiro. A rule-based model for local and regional tumor spread. In Proceedings of AMIA, 2002.

[24] Akrivi Katifori, Constantin Halatsis, George Lepouras, Costas Vassilakis, and Eugenia Giannopoulou. Ontology visualization methods - a survey. ACM Computing Surveys, 2007 (to appear).

[25] Akrivi Katifori, Elena Torou, Constantin Halatsis, Georgios Lepouras, and Costas Vassilakis. A comparative study of four ontology visualization techniques in Protege: Experiment setup and preliminary results. In Proceedings of Information Visualization, 2006.

[26] Esther Kaufmann and Abraham Bernstein. How useful are natural language interfaces to the semantic web for casual end-users? In 6th International Semantic Web Conference (ISWC 2007), pages 281–294, 2007.

[27] Roy C.P. Kerckhoffs, Maxwell L. Neal, Quan Gu, James B. Bassingthwaighte, Jeff H. Omens, and Andrew D. McCulloch. Coupling of a 3D finite element model of cardiac ventricular mechanics to lumped systems models of the systemic and pulmonic circulation. Annals of Biomedical Engineering, 2007.

[28] William D. Lakin, Scott A. Stevens, Bruce I. Tranmer, and Paul L. Penar. A whole-body mathematical model for intracranial pressure dynamics. Journal of Mathematical Biology, 46:347–383, 2003.

[29] Patrick Lambrix, Manal Habbouche, and Marta Perez. Evaluation of ontology development tools for bioinformatics. Bioinformatics, 19(12):1564–1571, 2003.

[30] Catherine M. Lloyd, Matt D.B. Halstead, and Poul F. Nielsen. CellML: its future, present and past. Progress in Biophysics and Molecular Biology, 85:433–450, 2004.

[31] Max Lewis Neal and James B. Bassingthwaighte. Subject-specific models for the estimation of cardiac output and blood volume during hemorrhage. Submitted to Critical Care Medicine, 2007.

[32] Chris North. Toward measuring visualization insight. IEEE Computer Graphics and Applications, 26(3), 2006.

[33] Natasha F. Noy and Mark A. Musen. The PROMPT suite: Interactive tools for ontology merging and mapping. Technical report, Stanford Medical Informatics, 2003.

[34] J. Tinsley Oden, Ted Belytschko, Jacob Fish, Thomas J.R. Hughes, Chris Johnson, David Keyes, Alan Laub, Linda Petzold, David Srolovitz, and Sidney Yip. Revolutionizing engineering science through simulation. NSF Blue Ribbon Panel on Simulation-based Engineering Science, 2006.

[35] Mette S. Olufsen, Ali Nadim, and Lewis A. Lipsitz. Dynamics of cerebral blood flow regulation explained using a lumped parameter model. Am J Physiol Regulatory Integrative Comp Physiol, 282:R611–R622, 2002.

[36] David Stephen John Perrin. PROMPT-Viz: Ontology version comparison visualizations with treemaps. Master’s thesis, University of Victoria, 2001.

[37] George Robertson, Kim Cameron, Mary Czerwinski, and Daniel Robbins. Polyarchy visualization: Visualizing multiple intersecting hierarchies. In CHI, 2002.

[38] Cornelius Rosse and Jose L. V. Mejino Jr. A reference ontology for biomedical informatics: the foundational model of anatomy. Journal of Biomedical Informatics, 36:478–500, 2003.

[39] Daniel L. Rubin, David Grossman, Maxwell Neal, Daniel L. Cook, James B. Bassingthwaighte, and Mark A. Musen. Ontology-based representation of simulation models of physiology. In AMIA Annual Symposium Proceedings, 2006.

[40] Purvi Saraiya, Peter Lee, and Chris North. Visualization of graphs with associated timeseries data. In IEEE Symposium on Information Visualization, 2005.

[41] Purvi Saraiya, Chris North, and Karen Duca. An insight-based methodology for evaluating bioinformatics visualizations. IEEE Transactions on Visualization and Computer Graphics, 11(4), 2005.

[42] N. P. Smith, D. P. Nickerson, E. J. Crampin, and P. J. Hunter. Multiscale computational modelling of the heart. Acta Numerica, pages 371–431, 2004.

[43] M. A. Storey, M. Musen, J. Silva, C. Best, N. Ernst, R. Fergerson, and N. Noy. Jambalaya: an interactive environment for exploring ontologies. In Intl Conference on Intelligent User Interfaces, 2002.

[44] Teranode. Leveraging pathway analytics for life sciences research and development, 2005.

[45] Edward R. Tufte. Envisioning Information. Graphics Press, 1999.

[46] Edward R. Tufte. The Visual Display of Quantitative Information. Graphics Press, 1999.

[47] Fan-Yin Tzeng and Kwan-Liu Ma. Opening the black box – data driven visualization of neural networks. In IEEE Visualization, 2005.

[48] Rosario Uceda-Sosa, Cindy X. Chen, and Kajal T. Claypool. CLOVE: A framework to design ontology views. In ER, pages 844–849, 2004.

[49] Frank van Ham, Huub van de Wetering, and Jarke J. van Wijk. Interactive visualization of state transition systems. IEEE Transactions on Visualization and Computer Graphics, 8(4):319–329, 2002.

[50] Martin Wattenberg. Visual exploration of multivariate graphs. In ACM SIGCHI Conference on Human Factors in Computing Systems, 2006.

[51] Ka-Ping Yee, Danyel Fisher, Rachna Dhamija, and Marti Hearst. Animated exploration of dynamic graphs with radial layout. In IEEE Symposium on Information Visualization, 2001.

A Usability Survey

The following pages show the web survey that was given to the users of the ontology browser.
