service-enabling biomedical research enterprise

07/04/22 Page 1 Service-enabling Biomedical Research Enterprise Chapter 5 B. Ramamurthy

Post on 13-Jan-2016

TRANSCRIPT

Page 1: Service-enabling Biomedical Research Enterprise

04/21/23Page 1

Service-enabling Biomedical Research Enterprise

Chapter 5, B. Ramamurthy

Introduction

• Life sciences have witnessed a flurry of innovations triggered by the sequencing of the human genome as well as the genomes of other organisms.

• The area of translational medicine aims to improve communication between basic and clinical science to allow more therapeutic and diagnostic insights.

Translational medicine

• From bench to bedside
• Exchange ideas, information and knowledge across organizational, governance, socio-cultural, political and national boundaries.
• Currently mediated by the Internet and exponentially increasing resources
• Digital resources: scientific literature, experimental data, and curated annotation (metadata), both human- and machine-generated. Ex: BLAST searches, NCBI Taxonomy

Driving principles

• Key requirements: large volume of data to be managed. How?
• Transform to
– Digital
– Machine readable
– Capable of being filtered
– Aggregated
– Transformed automatically
– Context information: use and meaning along with content
– Knowledge integration: combines data from research in mouse genetics, cell biology, animal neuropsychology, protein biology, neuropathology, and other areas.
– Attention to drug discovery, systems biology and personalized medicine, which rely heavily on integrating and interpreting data produced by experiments.
– Heterogeneous data

BioSem Enterprise Architecture

[Figure: BioSem enterprise architecture. Box labels: Clinical data (Ex: JNI); Research knowledge (Ex: Blast); Clinical experiments (Ex: drug discovery); Transform results (Ex: integrate, generate metadata); Ontology; Academic knowledge (Ex: cell, psychology, molecular); Search; Diagnostic tools; Treatment methods; Dissemination of results.]

Use case

• Parkinson’s disease (PD):
– System physiology perspective
– Cellular and molecular biology perspective
– Pharmacology relating to chemical compounds that bind to receptors
– Example query: show me the neuronal components that bind to a ligand which is a therapeutic agent in Parkinson’s disease in reach of the dopaminergic neurons in the substantia nigra.
– Domain-specific shared semantics and classifications
– Ontologies can help map among the domains and support seamless integration and interoperation.

Development of Ontologies

• Manual interaction between ontologists and domain experts

• Textual descriptions are used for adding to this base

• Link pre-existing ontologies for extensive coverage

Ontology design and creation Approach (fig. 5.1)

[Figure 5.1: Ontology design and creation approach. Inputs: subject matter knowledge (text); information queries; pre-existing classifications and ontologies. Steps shown: identify core terms and phrases; map phrases to relationships between classes; model terms using ontological constructs (classes, properties); arrange classes and relationships in subsumption hierarchies; identify new classes and relationships; refine subsumption hierarchies; re-use classes and relationships; extend subsumption hierarchies.]

Identifying concepts and hierarchies

• Text describing PD in p.105
• Study the analysis
• Based on the analysis, identify important ontological concepts relevant to PD:
– Genes
– Proteins
– Genetic mutations
– Diseases
• See fig. 5.2
• Next step is to identify relationships among concepts

Identifying and extracting relationships

[Figure: class hierarchy. Node labels: owl:Thing, rdfs:Resource, Disease, Gene, LewyBody, ParkinsonDisease, UCHL-1.]
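
The class names on this slide can be arranged in a subsumption hierarchy. The sketch below is only an illustration: the slide gives the names but not the links between them, so the parent assigned to each class is an assumption, and UCHL-1 is treated as a class for simplicity.

```python
# Hypothetical subclass links. The class names come from the slide;
# which class sits under which is an assumption made for illustration.
SUBCLASS_OF = {
    "Disease": "owl:Thing",
    "Gene": "owl:Thing",
    "LewyBody": "owl:Thing",
    "ParkinsonDisease": "Disease",
    "UCHL-1": "Gene",
}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_subclass_of(cls, parent):
    """True if parent subsumes cls, directly or transitively."""
    return cls == parent or parent in ancestors(cls)
```

A query for "any Disease" can then also return items classified under ParkinsonDisease, which is the practical payoff of arranging classes in a hierarchy.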

Extending the ontology based on information queries

• Consider various queries and identify the concepts and relationships that need to be part of the PD ontology.

• These concepts are needed to retrieve information and knowledge from the system.

• This leads to additional new concepts. See fig. 5.4

PD: adding concepts to support information queries

[Figure: PD ontology with concepts added to support information queries. Node labels: owl:Thing, rdfs:Resource, Pathway, AnatomicalEntity, Protein.]

Ontology Re-use

• It is desirable to re-use the ontologies and vocabularies developed in the healthcare and life-sciences fields.
• Diseases: PD information can be used in Huntington’s and Alzheimer’s. PD can reuse information from the International Classification of Diseases (ICD) and SNOMED.
• Genes: more genes and genomic concepts such as proteins and pathways are added to ontologies. Consider connecting to Gene Ontology.
• Neurological concepts: consider using NeuroNames 2007.
• Enzymes: concepts related to enzymes and other chemicals may be required; you may use Enzyme Nomenclature 2007.
• Be aware of inconsistencies and circularities.
• Multiple models may emerge; the choice should be based on use cases and functional requirements.

Data sources

• To answer the example query posed on the Use case slide, three data sources need to be integrated:

• Neuron database, PDSP Ki database, PubChem

Data Integration

• A centralized approach, where data available through web-based interfaces is converted into RDF and stored in a centralized repository

• A federated approach, where data continues to reside in the existing repositories and an RDF mediator converts the underlying data into RDF format

• RDF allows a focus on the logical structure of information, in contrast to only its representational format (XML) or storage format (relational).
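
The two approaches can be contrasted with a minimal sketch. Everything named here (the source functions, row shapes, and `ex:` identifiers) is a hypothetical stand-in rather than the real interface of any of the databases; triples are plain `(subject, predicate, object)` tuples.

```python
# Toy "existing repositories", each returning rows in its own native shape.
# Source names, row shapes and "ex:" identifiers are illustrative assumptions.
def neuron_db_rows():
    return [{"receptor": "ex:D1", "compartment": "ex:D_Dendrite"}]


def pdsp_rows():
    return [("ex:5-HTryptamine", "ex:D1")]  # (ligand, receptor)


# Per-source converters that turn native rows into RDF-style triples.
def neuron_db_to_rdf(rows):
    return {(r["receptor"], "located_in", r["compartment"]) for r in rows}


def pdsp_to_rdf(rows):
    return {(ligand, "binds_to", receptor) for ligand, receptor in rows}


SOURCES = [(neuron_db_rows, neuron_db_to_rdf), (pdsp_rows, pdsp_to_rdf)]


def build_central_store():
    """Centralized: convert every source once and keep the triples."""
    store = set()
    for fetch, to_rdf in SOURCES:
        store |= to_rdf(fetch())
    return store


def federated_query(predicate):
    """Federated: data stays in place; the mediator converts at query time."""
    for fetch, to_rdf in SOURCES:
        for triple in to_rdf(fetch()):
            if triple[1] == predicate:
                yield triple
```

The trade-off visible even at this scale: the central store answers from a snapshot that must be refreshed, while the federated mediator always reads current data but pays the conversion cost on every query.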

Mapping ontological concepts to RDF graphs

• The sample query discussed earlier results in these concepts:
– Compartment located_on Neuron
– Receptor located_in Compartment
– Ligand binds_to Receptor
– Ligand associated_with Disease
• The next task is to map these onto RDF graphs in the underlying data sources.
• Using ontological definitions, data sources, SPARQL queries, and namespaces, RDF graphs are extracted.
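
The four concept relationships above behave like the triple patterns of a SPARQL basic graph pattern. The sketch below is a simplified stand-in for a SPARQL engine, not the tooling used in the chapter: it matches patterns containing `?variables` against a set of triples. The sample data and `ex:` identifiers are invented for illustration.

```python
def is_var(term):
    """Variables are written SPARQL-style, with a leading '?'."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for p, t in zip(first, triple):
            if is_var(p):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)


# The four relationships from the slide, written as triple patterns.
PATTERNS = [
    ("?compartment", "located_on", "?neuron"),
    ("?receptor", "located_in", "?compartment"),
    ("?ligand", "binds_to", "?receptor"),
    ("?ligand", "associated_with", "?disease"),
]

# Hypothetical data: LigandB binds the receptor but has no disease link.
TRIPLES = {
    ("ex:Dendrite", "located_on", "ex:Neuron1"),
    ("ex:D1", "located_in", "ex:Dendrite"),
    ("ex:LigandA", "binds_to", "ex:D1"),
    ("ex:LigandA", "associated_with", "ex:ParkinsonDisease"),
    ("ex:LigandB", "binds_to", "ex:D1"),
}

results = list(match(PATTERNS, TRIPLES))
```

Only the ligand that satisfies all four patterns survives; a shared variable such as `?receptor` is what joins one pattern to the next.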

Generation and merging of RDF graphs

[Figure: RDF graphs generated from each data source; every node is labelled with a URI. Neuron Database: nodes D_Neuron, Neuron, D_Dendrite, D1; edges located_in (twice), type. PDSP Ki Database: nodes D1, 5-H Tryptamine; edge binds_to. PubChem database: nodes 5-H Tryptamine, Parkinson’s disease; edge associated_with.]

Integrated RDF graph

[Figure: integrated RDF graph; every node is labelled with a URI. Nodes: D_Neuron, Neuron, D_Dendrite, D1, 5-H Tryptamine, Parkinson’s disease. Edges: located_in (twice), type, binds_to, associated_with.]
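
Merging works because the per-source graphs share URIs: D1 appears in both the neuron and PDSP Ki graphs, and 5-H Tryptamine in both the PDSP Ki and PubChem graphs. A minimal sketch, assuming triples are plain tuples; the `ex:` identifiers and the edge directions are assumptions read off the node and edge labels, not taken from the source data.

```python
# One small graph per source. The "ex:" identifiers and edge directions
# are assumptions for illustration.
neuron_db = {
    ("ex:D_Neuron", "type", "ex:Neuron"),
    ("ex:D_Dendrite", "located_in", "ex:D_Neuron"),
    ("ex:D1", "located_in", "ex:D_Dendrite"),
}
pdsp_ki = {("ex:5-HTryptamine", "binds_to", "ex:D1")}
pubchem = {("ex:5-HTryptamine", "associated_with", "ex:ParkinsonsDisease")}


def merge(*graphs):
    """Merging RDF graphs is set union; identical URIs become one node."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def subjects(graph, predicate, obj):
    return {s for s, p, o in graph if p == predicate and o == obj}


def objects(graph, subject, predicate):
    return {o for s, p, o in graph if s == subject and p == predicate}


integrated = merge(neuron_db, pdsp_ki, pubchem)

# Walk from the disease to ligands, receptors, and their compartments.
ligands = subjects(integrated, "associated_with", "ex:ParkinsonsDisease")
receptors = {r for lig in ligands for r in objects(integrated, lig, "binds_to")}
compartments = {c for r in receptors for c in objects(integrated, r, "located_in")}
```

No single source can answer the walk on its own: the disease link lives in one graph, the binding in another, and the location in a third.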

Exam question?

• Consider the PD case study, which used an ontological approach to querying distributed databases.

1. Discuss 10 reasons for using this approach as opposed to the common SQL query and relational database approach.

2. Why are Google, Yahoo or MSN searches not good enough for searching biological databases?

3. Discuss the centralized and federated approaches to data integration in the context of this case study.

4. Submit a soft copy of the document in the digital drop box.

How to do this? Read Chapter 5, then read it again. The answers can be formed from the information provided there and from your experience with relational database systems.

Summary

• Semantic web technologies provide an attractive informatics foundation for enabling the bench-to-bedside vision.

• Many areas of biomedical research, including drug discovery, systems biology and personalized medicine, rely heavily on integrating and interpreting heterogeneous data sets.

• This is part of ongoing work within the Healthcare and Life Sciences Interest Group of the W3C.