Stanford Center for Biomedical Informatics Research
An Ontology-Based Approach for Computational Phenomics:
Application to Autism Spectrum Disorder
Amar K. Das, MD, PhD
Departments of Medicine and of Psychiatry and Behavioral Sciences
NCBO Webinar, October 7, 2009
Hasler G, et al. Toward constructing an endophenotype strategy for bipolar disorders. Biological Psychiatry (2006)
Represent findings and their links using structured knowledge
Phenomics
“A primary task for the new field of phenomics will be to clarify what, in practical terms, constitutes a phenotype and then to delineate the different phenotypic components that compose the phenome.”
Freimer & Sabatti, Nature Genetics (2003)
Current Approaches
Lack of standardization
Lack of organization
Lack of computability
Autism DSM-IV Diagnosis
A total of six (or more) items from (1), (2), and (3), with at least two from (1), and one each from (2) and (3)
(1) qualitative impairment in social interaction, as manifested by at least two of the following:
a) marked impairments in the use of multiple nonverbal behaviors such as eye-to-eye gaze, facial expression, body posture, and gestures to regulate social interaction
b) failure to develop peer relationships appropriate to developmental level
c) a lack of spontaneous seeking to share enjoyment, interests, or achievements with other people (e.g., by a lack of showing, bringing, or pointing out objects of interest to other people)
d) lack of social or emotional reciprocity
Autism DSM-IV Diagnosis
(2) qualitative impairments in communication as manifested by at least one of the following:
a) delay in, or total lack of, the development of spoken language (not accompanied by an attempt to compensate through alternative modes of communication such as gesture or mime)
b) in individuals with adequate speech, marked impairment in the ability to initiate or sustain a conversation with others
c) stereotyped and repetitive use of language or idiosyncratic language
d) lack of varied, spontaneous make-believe play or social imitative play appropriate to developmental level
Autism DSM-IV Diagnosis
(3) restricted repetitive and stereotyped patterns of behavior, interests, and activities, as manifested by at least one of the following:
a) encompassing preoccupation with one or more stereotyped and restricted patterns of interest that is abnormal either in intensity or focus
b) apparently inflexible adherence to specific, nonfunctional routines or rituals
c) stereotyped and repetitive motor mannerisms (e.g., hand or finger flapping or twisting, or complex whole body movements)
d) persistent preoccupation with parts of objects
Delays or abnormal functioning in at least one of the following areas, with onset prior to age 3 years:
(1) social interaction
(2) language as used
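The item-count rule stated at the top of the DSM-IV criteria can be written as a small check. This is a minimal sketch, not a diagnostic tool: the function name and the input layout (domain number mapped to a set of endorsed item letters) are assumptions made for illustration, and the age-of-onset criterion is not covered.

```python
def meets_dsm_iv_item_count(endorsed):
    """Check the counting rule: six or more items in total,
    at least two from domain (1), and at least one each from (2) and (3).

    endorsed: dict mapping domain number (1, 2, 3) to a set of
    endorsed item letters, e.g. {1: {"a", "d"}, 2: {"b"}, 3: {"c"}}.
    This layout is hypothetical, not an NDAR format.
    """
    social = len(endorsed.get(1, set()))         # domain (1): social interaction
    communication = len(endorsed.get(2, set()))  # domain (2): communication
    repetitive = len(endorsed.get(3, set()))     # domain (3): restricted/repetitive behavior
    total = social + communication + repetitive
    return total >= 6 and social >= 2 and communication >= 1 and repetitive >= 1
```

Even this simple rule needs a fixed encoding of items before it can be computed, which is the standardization problem the talk raises.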
Goals of NDAR
Develop standards to promote meta-analyses and cross-site research data comparisons
Provide researchers access to useful software tools and infrastructure
Promote the sharing of research data relevant to ASD
NIH Research Support in Autism
$100 million/year in funding
Investigator-initiated grants (R01s)
Special initiatives, e.g., RFA for genetics
Centers and networks
Training grants (to institutions and individuals)
New initiatives
Intramural Research Program on Autism
Autism Centers of Excellence (ACE)
National Database for Autism Research (NDAR)
ARRA stimulus program
Query and Reporting
BIRN Services & Resources
NDAR System
Security
Portal
Grid Computing
Collaboration
Data Storage Management
Data Integration Tools
Auditing
User Management
Subject Tracking & Management
Clinical Assessments (OpenClinica)
Common Measures
Study Management
Neuroimaging
Image Analysis
Image Processing
Image data access
Genomics
Genomics data access
Data Integration
Phenotypes in Psychiatry
‘The observable structural and functional characteristics of an organism determined by its genotype and modulated by its environment’
Diagnostic component
Intermediate phenotype
Quantitative phenotype
Covariates
Example Query #1
Find all subjects who are verbal (ADIR A14). Then look at their IQ (Cognitive Total IQ > 70) and whether or not they have seizures (Medical History Q10). Also find out if they have an abnormal MRI or any genetic abnormalities.
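One way to read this query is as a filter on verbal status followed by a report of the remaining findings per subject. The sketch below assumes a hypothetical in-memory record type; the field names are stand-ins for the instrument items named in the query and do not reflect the actual NDAR schema.

```python
from dataclasses import dataclass


@dataclass
class SubjectRecord:
    # All field names are hypothetical stand-ins for the items in the query.
    subject_id: str
    verbal: bool               # ADI-R A14
    total_iq: int              # Cognitive Total IQ
    has_seizures: bool         # Medical History Q10
    abnormal_mri: bool
    genetic_abnormality: bool


def example_query_1(records):
    """Keep verbal subjects, then report IQ > 70, seizures, MRI, and genetics."""
    rows = []
    for r in records:
        if not r.verbal:
            continue  # the query starts from verbal subjects only
        rows.append((r.subject_id, r.total_iq > 70, r.has_seizures,
                     r.abnormal_mri, r.genetic_abnormality))
    return rows
```

The query touches clinical assessments, medical history, imaging, and genomics at once, so each field here would come from a different data source in practice.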
Example Query #2
Use head circumference to categorize macroencephaly. Then see if the subjects differ in their ADOS, ADI-R, cognitive, and language profiles, and combine this with genetic data.
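The first half of this query, splitting subjects by head circumference and comparing a profile measure across the two groups, could be sketched as follows. The cutoff is left as a parameter because the slide does not give one, and a real cutoff would depend on age and sex. The dictionary keys are hypothetical.

```python
from statistics import mean


def compare_by_head_circumference(subjects, cutoff_cm, measure):
    """Group subjects by a head-circumference cutoff and average one measure.

    subjects: list of dicts with a 'head_circumference_cm' key and the
    named measure (e.g. a hypothetical 'ados_total'). cutoff_cm is an
    assumed, caller-supplied threshold, not a value from the talk.
    """
    groups = {"macroencephaly": [], "other": []}
    for s in subjects:
        key = "macroencephaly" if s["head_circumference_cm"] > cutoff_cm else "other"
        groups[key].append(s[measure])
    # An empty group has no mean, so report None for it.
    return {k: (mean(v) if v else None) for k, v in groups.items()}
```

The same grouping could then be applied to ADI-R, cognitive, and language measures, and joined with genetic data on the subject identifier.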
NDAR Project
Systematic Review
Ontology Development
Database Infrastructure
Systematic Review
“(ADI-R or ADOS or Vineland) and (genes or genetics) and autism”
26/43 papers relevant
Mean # phenotypes 4.1, range 1-13
Three basic types (1:1, sum, cutoff score)
Tu, S. W. AMIA Annual Proceedings (2008)
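The three basic types of phenotype definition found in the review can be illustrated with three small functions. This is a sketch under assumptions: the function names are invented for illustration, and the item values passed in stand for coded answers on an assessment instrument.

```python
def one_to_one(item_value):
    """1:1 type: the phenotype is the value of a single instrument item."""
    return item_value


def summed(item_values):
    """Sum type: the phenotype is the total over several item scores."""
    return sum(item_values)


def cutoff(score, threshold):
    """Cutoff type: the phenotype holds when a score exceeds a threshold."""
    return score > threshold
```

For example, `cutoff(30, 24)` is true for a subject whose first words came at 30 months when the threshold is 24 months. Papers that choose different thresholds for the same term would produce different results from the same data.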
Systematic Review
Different terms
e.g., ‘age of first phrases’ and ‘age of onset of phrase speech’
Different cutoff scores
e.g., ‘delayed word’
Different definitions
e.g., ‘regression’
e.g., use of different instruments
Ontology
A taxonomy with multiple link types, each with precise meaning
Clinical Research Study
Clinical Trial Study
Case Study
Controlled Case Study
Study Arms
Perspectives on ‘Ontology’
Philosophy: The study of what entities and what types of entities exist in reality
Computer Science: A schema that represents a domain and is used to reason about the objects in that domain and the relations between them
Critical to the ‘Semantic Web’
Shared research and development plan to:
Provide explicit semantic meaning to data and knowledge shared on the Web
Bring structure to Web content
Advance the current state of the art in Web information retrieval, which is keyword searching
Distributed applications will be able to process data and knowledge automatically through the use of ontologies
OWL: Web Ontology Language
Advances current Semantic Web standards by using ontologies to represent knowledge
OWL can be used to build ontologies of high-level descriptions, based on three concepts:
Classes (e.g., Subject, Phenotype, Genotype)
Properties (e.g., isBearerOf, hasResults)
Individuals (e.g., “Macroencephaly”)
OWL: Web Ontology Language
Subject
Genotype
Phenotype
mutIn-RELN
Macro-encephaly
011451
hasResult
isBearerOf
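The classes, properties, and individuals named on this slide can be pictured as subject-property-object triples. The sketch below uses plain Python tuples rather than an OWL library. Which individual belongs to which class, and which individuals each property connects, is an assumption read off the labels; the original diagram is not recoverable from the transcript.

```python
# Each triple is (subject, property, object). The class assignments and
# the direction of hasResult are assumptions made for illustration.
TRIPLES = {
    ("011451", "rdf:type", "Subject"),
    ("mutIn-RELN", "rdf:type", "Genotype"),
    ("Macro-encephaly", "rdf:type", "Phenotype"),
    ("011451", "isBearerOf", "Macro-encephaly"),
    ("011451", "hasResult", "mutIn-RELN"),
}


def objects_of(subject, prop, triples=TRIPLES):
    """Return the sorted objects linked to a subject by a given property."""
    return sorted(o for (s, p, o) in triples if s == subject and p == prop)
```

Under these assumptions, `objects_of("011451", "isBearerOf")` returns the phenotypes borne by subject 011451.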
BIRNLex
A controlled terminology for annotation of BIRN data sources, focusing on imaging data from human subjects and mouse models
Terms cover neuroanatomy, molecular species, behavioral and cognitive processes, subject information, experimental practice and design
Basic Formal Ontology
An upper ontology which can be used to support the development of domain ontologies used in scientific research
All concepts are subclasses of
Continuants: exist in full at any time at which they exist at all
Occurrents: have temporal parts and happen, unfold, or develop through time
OBO Foundry
Ontologies should be orthogonal
Minimize overlap
Each distinct entity type (universal) should only be represented once
Partition efforts in the OBO Foundry rationally to help organize and coordinate the ontology development
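RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY: ORGAN AND ORGANISM
Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
Dependent continuant: Organ Function (FMP, CPRO)
Occurrent: Organism-Level Process (GO)
GRANULARITY: CELL AND CELLULAR COMPONENT
Independent continuant: Cell (CL); Cellular Component (FMA, GO)
Dependent continuant: Cellular Function (GO)
Occurrent: Cellular Process (GO)
GRANULARITY: MOLECULE
Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
Dependent continuant: Molecular Function (GO)
Occurrent: Molecular Process (GO)
Dependent continuant: Phenotypic Quality (PaTO)
Chris Mungall, PATO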
SWRL: Semantic Web Rule Language
W3C specification for expressing logical rules that can be formulated in terms of OWL concepts
Rules in SWRL can be used to deduce new knowledge about an existing OWL ontology
Specification can be extended through the use of built-ins
Example SWRL Rule: hasUncle
hasParent(?x, ?y) ^ hasBrother(?y, ?z) → hasUncle(?x, ?z)
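The hasUncle rule joins two relations on the shared variable ?y. A minimal Python sketch of that join follows. It represents each property as a set of pairs, which is an illustrative simplification and not how a SWRL rule engine stores OWL assertions.

```python
def infer_has_uncle(has_parent, has_brother):
    """Apply hasParent(?x, ?y) ^ hasBrother(?y, ?z) -> hasUncle(?x, ?z).

    has_parent: set of (child, parent) pairs.
    has_brother: set of (person, brother) pairs.
    Returns the set of inferred (person, uncle) pairs.
    """
    return {
        (x, z)
        for (x, y) in has_parent
        for (y2, z) in has_brother
        if y == y2  # the shared variable ?y binds both atoms
    }
```

Given `{("Ann", "Bob")}` as hasParent and `{("Bob", "Carl")}` as hasBrother, the function infers that Carl is Ann's uncle. The names are made up for the example.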
Example SWRL Rule: hasSister
Person(Amar) ^ hasSibling(Amar, ?s)
^ Woman(?s)
→ hasSister(Amar, ?s)
Example SWRL Rule: Child
Person(?p) ^ hasAge(?p, ?age) ^ swrlb:lessThan(?age, 17) → Child(?p)
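The Child rule combines a class test, a property lookup, and a built-in comparison. The sketch below mirrors those three atoms in Python, assuming each person has at most one age value. That is a simplification, since an OWL property may hold several values.

```python
def infer_children(persons, has_age):
    """Apply Person(?p) ^ hasAge(?p, ?age) ^ swrlb:lessThan(?age, 17) -> Child(?p).

    persons: set of individuals asserted to be Person.
    has_age: dict mapping an individual to a single age (an assumption).
    Returns the set of individuals inferred to be Child.
    """
    return {
        p
        for p in persons
        if p in has_age       # hasAge(?p, ?age) must be asserted
        and has_age[p] < 17   # swrlb:lessThan(?age, 17)
    }
```

An individual with no recorded age is not classified, which matches how a rule fires only when every atom in its body is satisfied.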
Rule-Based Methods
Extensions to SWRL
Temporal: Library of temporal built-ins
Query: Extraction of results as a table
MakeSet: Support for set-based operations
Development Methods
Extensions to BIRNLex
Encoding of phenotypes
Querying of NDAR database
Autism Assessment Result
Figure 1. The representation of data collected through the ADI-2003 autism assessment instrument as part of the autism ontology.
Phenotype Representation
Figure 2. The representation of the Status of age of words phenotype group as an OWL class partitioned by the possible statuses.
Phenotype Rule
ADI_2003_result(?assessment) ^
acqorlossoflang_aword(?assessment,?wordage) ^
swrlb:greaterThan(?wordage, 24) ^
subject_id(?assessment, ?subjectId) ^
orgtax:Human(?subject) ^
subject_id(?subject, ?subjectId)
→ birn_obo_ubo:bearer_of(?subject, Delayed_word)
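Read procedurally, the rule above finds each ADI-2003 assessment with an age of first words greater than 24, matches it to a human subject by subject identifier, and asserts that the subject bears the Delayed_word phenotype. A minimal Python sketch of that logic follows. The dictionary layout is hypothetical; only the property name acqorlossoflang_aword and the threshold of 24 come from the rule.

```python
def infer_delayed_word(assessments, human_subject_ids):
    """Return subject ids inferred to bear the Delayed_word phenotype.

    assessments: list of dicts, each with 'subject_id' and, optionally,
    'acqorlossoflang_aword' (age of first words). This layout is an
    assumption, not the NDAR or ontology storage format.
    human_subject_ids: set of ids for individuals classed as Human.
    """
    bearers = set()
    for a in assessments:
        wordage = a.get("acqorlossoflang_aword")
        if wordage is None:
            continue  # the rule cannot fire without the property value
        if wordage > 24 and a["subject_id"] in human_subject_ids:
            bearers.add(a["subject_id"])
    return bearers
```

Because the threshold lives in one rule, a different published cutoff for 'delayed word' can be encoded as a second rule and compared against the same data.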
Phenologue Project
Develop an ontology of endophenotypes that maps brain connectivity, neural deficits, and genetic markers into a subject domain theory
Develop logic-based methods to encode and classify endophenotypes based on multi-scale measurements
Create tools to acquire new endophenotypes and annotate phenotype-genotype findings in online resources such as published literature
Develop query-elicitation methods that can evaluate hypotheses about the subject domain theory of endophenotypes using deductive inference
Phenologue Project
Database
Phenotype Definitions
New Associations
Query
Catalog Analysis
Rule Technologies
Rule paraphrasing
Rule elicitation
Rulebase visualization
Knowledge mining using rules
Computational Phenomics
Informatics methods to support phenomics
Apply machine learning methods to discover groups of rules with common semantics
Use natural language processing methods to discover phenotype rules in published text
Future Directions
Expand phenotype categories
Use natural language processing methods to discover phenotype rules in published text
Apply machine learning methods to discover groups of rules with common semantics
Summary
The development of a standardized, organized, and computable set of phenotype terms is central to etiologic studies of complex disorders
The use of ontologies and rules to model phenotypes is feasible and can enable automated discovery of new phenotype-genotype relationships
Acknowledgments
Stanford Group
Martin O’Connor, Saeed Hassanpour, Duriel Hardy, Ravi Shankar, Lakshika Tennakoon, Samson Tu
National Center for Biomedical Ontology
Mark Musen, Daniel Rubin
NDAR/NIMH
Lynn Young, Matthew McAuliffe, Dan Hall, Lisa Gilotty
Biomedical Informatics Research Network
Bill Bug, Maryann Martone