ONTOLOGIES FOR BIG DATA Asiyah Yu Lin, M.D., M.S., Ph.D.

Upload: yu-lin

Post on 07-Feb-2017

Category:

Health & Medicine


TRANSCRIPT

Page 1: Ontologies for big data

ONTOLOGIES FOR BIG DATA

Asiyah Yu Lin, M.D., M.S., Ph.D.

Page 2: Ontologies for big data

My profile

<location>USA</location><work>postdoctoral training</work><work:company_type>University</work:company_type><work:has_title>research fellow</work:has_title><bioinformatics>ontology development</bioinformatics><bioinformatics>social network analysis</bioinformatics><bioinformatics>ontology applying data analysis</bioinformatics>

Postdoc training: Ontologies

<location>Japan</location><work>institution</work><work:company_type>non profit organization</work:company_type><work:has_title>bioinformatician</work:has_title><bioinformatics>454 sequence assembly</bioinformatics><bioinformatics>non model organism sequence analysis</bioinformatics>

Bioinformatician: NGS

<location>Japan</location><education:has_degree>Ph.D</education:has_degree><education:has_major>medical informatics</education:has_major><bioinformatics>ontology</bioinformatics><bioinformatics>data integration</bioinformatics><bioinformatics>biological pathway analysis</bioinformatics>

Ph.D. in Medical Informatics

<location>China</location><work>industry</work><work:company_type>start_up IT</work:company_type><work:has_title>content manager</work:has_title><work:has_title>project manager</work:has_title><IT_skill>web site building</IT_skill><IT_skill>relational database</IT_skill>

Content Manager & Project Manager

<location>China</location><education>Medical School </education><education:has_degree>master</education:has_degree><education:has_major>molecular immunology</education:has_major><bioinformatics>sequencing</bioinformatics><bioinformatics>protein 3D simulation</bioinformatics>

Master in Molecular Immunology

<location>China</location><education>Medical School</education><education:has_degree>bachelor</education:has_degree><education:has_major>Pediatrics</education:has_major>

M.D. in Pediatrics

Page 3: Ontologies for big data

Agenda

Introduction: ontologies, semantic web and big data

Selected projects:

1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)

SOCR Data Dashboard

Conclusion

Page 4: Ontologies for big data

What is ontology?

Ontologies are a form of knowledge representation: structural frameworks for organizing terms hierarchically and defining relations between terms within a domain.

1. A hierarchical vocabulary: class, subclass, instance
2. Defined relations between terms to interlink the whole system
3. Constraints and logical definitions
4. Explicit specification of a conceptualization (Gruber, 1993)
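The class-subclass-instance idea can be sketched in a few lines of Python. This is only an illustration: the term names and the helper functions `ancestors` and `is_instance_of` are hypothetical and do not come from any ontology named in these slides.

```python
# Hypothetical toy vocabulary; the term names are illustrative only.
SUBCLASS_OF = {
    "neuropathy": "disease",
    "disease": "entity",
    "drug": "entity",
}

# Instances point at the most specific class they belong to.
INSTANCE_OF = {"case_001": "neuropathy"}


def ancestors(cls):
    """Return every superclass of a class, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance_of(individual, cls):
    """An individual belongs to its direct class and to all its superclasses."""
    direct = INSTANCE_OF[individual]
    return cls == direct or cls in ancestors(direct)


print(ancestors("neuropathy"))
print(is_instance_of("case_001", "entity"))
```

The point of the hierarchy is that membership is inherited: nothing states directly that `case_001` is an `entity`, yet the subclass chain implies it.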

Page 5: Ontologies for big data

Why ontology?

• Knowledge management: RDF, RDFS, OWL
• Natural language processing: linguistic ontology (WordNet)
• E-commerce
• Intelligent information integration
• Knowledge acquisition and discovery
• Database design and integration
• Medical decision making agent
• Linked Open Data, Semantic Web

Page 6: Ontologies for big data

Semantic Web Layer Cake

RDF: simple triples, graph-based queries, supports very large amounts of data
Bill -has_address- Location A

OWL: significantly more expressive language, strong axioms, inference capabilities, consistency verification, but can be rather slow
Bill -has_address- Location A
Location A -is_address_of- Bill (inverse relation)
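The difference between the two layers can be illustrated with the Bill example. The sketch below is plain Python, not an RDF or OWL library: `infer_inverses` and `query` are hypothetical helpers that mimic, in a very reduced form, what an OWL reasoner does with an inverse-property axiom.

```python
def infer_inverses(triples, inverse_of):
    """Return the triples plus every triple implied by the inverse declarations."""
    # An inverse declaration works in both directions.
    pairs = dict(inverse_of)
    pairs.update({q: p for p, q in inverse_of.items()})
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in pairs:
            inferred.add((obj, pairs[predicate], subject))
    return inferred


def query(triples, subject=None, predicate=None, obj=None):
    """Graph-style pattern match; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# RDF level: only the asserted triple is known.
asserted = {("Bill", "has_address", "Location A")}

# OWL level: declaring the inverse property lets the second triple be inferred.
closed = infer_inverses(asserted, {"has_address": "is_address_of"})

print(query(asserted, subject="Location A"))
print(query(closed, subject="Location A"))
```

Querying the asserted triples for anything about Location A returns nothing, while the same query over the inferred set finds the inverse statement. The extra inference step is also where the cost noted on the slide comes from.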

Page 7: Ontologies for big data

SELECTED PROJECTS

1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Adverse event analysis: Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)

SOCR Data Dashboard

Page 8: Ontologies for big data

Informed Consent Ontology (ICO)

ICBO 2014 poster

Page 9: Ontologies for big data

SELECTED PROJECTS

1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Adverse event analysis: Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)

SOCR Data Dashboard

Page 10: Ontologies for big data

The power of reasoning

miRNA and Aging Ontology (MIAGO)

Database (in revision)

Page 11: Ontologies for big data

SELECTED PROJECTS

1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Adverse event analysis: Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)

SOCR Data Dashboard

Page 12: Ontologies for big data

ODNAE: Linking knowledge together

[Figure: (A) the general ODNAE design pattern for a drug-associated neuropathy AE; (B) the same pattern instantiated for bupropion.]

Terms in the pattern (A) and their counterparts in the bupropion example (B):

1. (A) drug-associated neuropathy AE (ODNAE); (B) bupropion (Aplenzin, Wellbutrin, Zyban, Budeprion SR, Buproban, Forfivo XL)-associated neuropathy AE (ODNAE_0000043). Both: is_a neuropathy AE (OAE_0000418).
2. (A) and (B) drug administration (OAE_0000011).
3. (A) a drug (DrON, linked to RxNORM, NDFRT); (B) Bupropion Oral Tablet (DRON_00026665). Both: is_a drug product (DrON_00000005).
4. (A) chemical element (ChEBI); (B) bupropion (CHEBI_3219).
5. (A) biological process (GO); (B) negative regulation of dopamine uptake (GO_0051585), is_a negative regulation of neurotransmitter uptake (GO_0051581).
6. (A) drug role in mechanism of action (NDFRT); (B) Dopamine Uptake Inhibitors [MoA] (N0000000114).
7. (A) and (B) human (NCBITaxon_9606).
8. (A) a quality (e.g., age) (PATO); (B) age (PATO_0000011).

Relations shown in the figure: is_a, preceded_by, has_specified_input, has_proper_part, has_role, is_realized_in, occurs in, has participant, has_quality.
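The identifiers in the bupropion example already show how ODNAE reuses terms from other ontologies: the prefix of each ID names its source. A minimal sketch of reading that out, assuming the `PREFIX_number` convention visible on the slide; `source_ontology` and `group_by_source` are hypothetical helpers, and the NDF-RT code (N0000000114) is left out because it does not follow that convention.

```python
from collections import defaultdict

# Term IDs and labels as they appear in the bupropion example on the slide.
TERMS = {
    "ODNAE_0000043": "bupropion-associated neuropathy AE",
    "OAE_0000418": "neuropathy AE",
    "OAE_0000011": "drug administration",
    "DRON_00026665": "Bupropion Oral Tablet",
    "CHEBI_3219": "bupropion",
    "GO_0051585": "negative regulation of dopamine uptake",
    "NCBITaxon_9606": "human",
    "PATO_0000011": "age",
}


def source_ontology(term_id):
    """Return the ontology prefix of an ID such as 'GO_0051585'."""
    prefix, _, local_id = term_id.partition("_")
    if not local_id:
        raise ValueError(f"no ontology prefix in {term_id!r}")
    # Upper-case so that spellings like DrON and DRON land in one group.
    return prefix.upper()


def group_by_source(terms):
    """Map each source ontology to the sorted list of term IDs taken from it."""
    groups = defaultdict(list)
    for term_id in sorted(terms):
        groups[source_ontology(term_id)].append(term_id)
    return dict(groups)


for ontology, ids in group_by_source(TERMS).items():
    print(ontology, ids)
```

Only one of these terms is minted by ODNAE itself; the rest are imported, which is what lets the adverse event be linked to drug, chemical, biological process, organism and quality knowledge maintained elsewhere.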

Page 13: Ontologies for big data

ODNAE results: knowledge base of 215 neuropathy AE drugs

[Figure, panels (A) and (B): related AEs and 20 AE types; related chemical compounds; 139 Mode of Action]

ICBO 2015 VDOS workshop

Page 14: Ontologies for big data

What’s missing in ODNAE

Only 13 GO biological processes were mapped to some MoA. Holistic analytic methods are needed to understand the mechanism.

We need more…

Page 15: Ontologies for big data

1. LINCS-BD2K

2. SOCR DASHBOARD

Page 16: Ontologies for big data

University of Miami Computational LINCS Center LINCS Data Coordinating Center http://lifeKB.org

BD2K LINCS Data Coordination and Integration Center http://lincs-dcic.org/

NIH LINCS Program

Library of Integrated Network-based Cellular Signatures

Page 17: Ontologies for big data

Drug and Gene Knockdown Followed by Genome-Wide Expression

KO and Mutant Genes and their Disease Phenotypes

Drug and Knockdown Effects on Cell Viability

Transcription Factors and Histone Modifications Profiled by ChIP-Seq

Protein-Protein Interactions and Cell- or Metabolic-Pathways

Gene Expression from Patient Cohorts with Genomics and Clinical Outcome

Data

Drugs and Toxic Chemicals that Cause Adverse Events

Networks

Bi-partite Graphs

Gene-Set Libraries

Hierarchical Trees

Page 18: Ontologies for big data

Bi-Partite Relationships Between Data Types

Drugs
Side Effects
Genes
Diseases
Proteins
Signatures
Patient Tumors
Cancer Cell Lines
Tissues
Mutations
Mouse Phenotypes

Page 19: Ontologies for big data

Data integration and systems modeling

Page 20: Ontologies for big data

SOCR Analytics Dashboard

Statistics Online Computational Resource

Provides graphical querying, navigation and exploration of multivariate associations in complex heterogeneous datasets.

Integrates dispersed multi-source data and serves the mashed-up information via human and machine interactions in a secure, scalable manner.

http://socr.umich.edu/HTML5/Dashboard/

Page 21: Ontologies for big data
Page 22: Ontologies for big data

Conclusion:

1. Ontologies are important components for Big Data integration and manipulation.
2. Reusing ontologies enables seamless integration with other resources.
3. However, ontologies cannot solve all the problems in the biomedical world; they are tools to support science.
4. Formalized ontologies can be used by humans and automated systems as a basis for communication and data exchange (such as RDF data).
5. Ontology-based applications may go beyond reasoning alone and use statistical analyses (enrichment), semantic similarity, network analysis, graph algorithms, clustering, etc.
6. Much more remains to be explored in the big-data era.
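As one illustration of the statistical analyses mentioned in point 5, an enrichment test asks whether an ontology term is annotated to more items in a sample than chance would predict. The sketch below uses the standard hypergeometric tail; the function name `enrichment_p_value` and the counts in the usage lines are hypothetical and are not results from any of the projects above.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated items in the sample.

    population: total number of items (e.g. all genes considered)
    annotated:  items in the population carrying the ontology term
    sample:     size of the list being tested
    hits:       items in the sample carrying the ontology term
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 items, 5 carry the term, 4 sampled, 2 of them hits.
p = enrichment_p_value(population=20, annotated=5, sample=4, hits=2)
print(round(p, 4))
```

A small p-value suggests the term is over-represented in the sample. In practice one such test is run per ontology term, so a multiple-testing correction would normally follow.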