drug target prediction using semantic linked data

Post on 12-May-2015

720 Views

Category:

Health & Medicine

1 Downloads

Preview:

Click to see full reader

DESCRIPTION

beyond data integration, what else can we do? this presentation shows how we use semantic linked data for drug target prediction.

TRANSCRIPT

DRUG TARGET PREDICTION USING SEMANTIC LINKED DATA

Bin Chen Ph.D. candidate School of Informatics and Computing, Indiana University

binchen@indiana.edu http://cheminfo.informatics.indiana.edu/~binchen

CSHALS, Feb 23, 2012

Beyond Data Integration

Chem2Bio2RDF.ORG

What is Drug Target?

Why Drug Target Prediction?

How to predict drug target?

?

• Substructure• Side effect• Chemical ontology• Gene expression profile

Drug 1 Target 1

bind

Drug 2From Ligand (drug) perspective

?

• Sequence• 3D structure• Gene Ontology• Ligand

Drug 1 Target 1

bind

Target 2

From target perspective

Troglitazone

PPARG

ACSL4

bind

PPARA

bind

bind

Rosiglitazone Pioglitazo

ne

hypoglycemic drug

Chemical ontology

Chemical ontology

Chemical ontology

bindbind

PPAR signaling pathway

Eicosapentaenoic Acid

Response to nutrient

pathway

bind

bind

GO

GO

pathway

troglitazone

troglitazone

Semantic Linked Network

Topology is important for association

Cmpd 1Protein

1 Cmpd 2

Cmpd 1Protein

1 Cmpd 2

hasSubstructure

hasSubstructure

hasSubstructure

hasSubstructure

bind

bind

Protein2

Cmpd1

Cmpd 2Protein

1

Protein2

Cmpd1Protein

1

hasGOhasGO

Protein2

Cmpd1Protein

1bind PPI

Cmpd1Protein

1Cmpd 2 bindhasSideeffect hasSide ffect

Cmpd1Protein

1Cmpd 2 bindhasSubstructure hasSubstructure

Semantic is important for association

GO:00001

hypertension

substructure1

bind

bind bind bind

(Semantic Link Association Prediction)

Data schema

Path finding

Drug: Troglitazone

Target :PPARG>300k nodes, >1million edges

Statistical Model---edge weight

Target I

Target JPPI

PPI

PPI

hasGOhasGO

hasPathway

bind

P(i j) =1/3

Statistical Model---path raw score

Target2

Cmpd1 (s)

Target

1 (t)

e1 e2e4 e3

Protein2

Cmpd1

Cmpd1

Protein1

Path pattern

bind bind bind

Protein

Cmpd

Cmpd

Protein

bind bind bind

Protein1

Cmpd3

Cmpd4

Protein5

bind bind bind

Path examples:

Path pattern:

• Randomly sample 100,000 drug target pairs, yielding 453,087 paths, 35 patterns

• Plot Path pattern score distribution

• Convert path score to path z score

Path Pattern Score Distribution

Statistical Model---path z score

Statistical Model---association score

Fit association scores of random pairs to normal distribution

Association Score distribution among different pairs

Comparing SLAP with link prediction methods in social science

ROC curve

AUC=0.92

Drug polypharmacology profiles

Polypharmacology profile comprises of association scores of one drug against over one thousand targets

Moexipril

Rescinnamine

Benazepril

Quinapril

Fosinopril

Trandolapril

Enalapril

Chlorthalidone

Metolazone

Trichlormethiazide

Betaxolol

Alprenolol

Acebutolol

Nadolol

Bisoprolol

Drug similarity network

• Nodes present drugs

• Two nodes are linked if they are similar in terms of biological function.

• Nodes are colored by their therapeutic indications

Insomnia related drugs

Dissimilar Drugs have same indication

Drug repurposing

Anti-Parkinson

allergic rhinitis

http://www.ebi.ac.uk/chebi/searchId.do?chebiId=3398

?

Summary

Semantic Link association can be assessed by topology and semantics of the network

Domain knowledge plays an important role!

Team

Prof. Ying Ding Prof. David Wild

http://chem2bio2rdf.org/slap

Thanks!

Backup slides

SLAP Pipeline

Path filtering

Chen, B., Dong. X., Jiao, D., Wang, H., Zhu, Q., Ding, Y., Wild, D.J. Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data. BMC Bioinformatics, 2010, 11:255

Chem2Bio2RDF Datasets

Chem2Bio2RDF data

Other data venderscompoundprotein/genechemogenomicsliteratureothers

http://chem2bio2rdf.org

Ranking Target associated chemicals

Randomly select three targets Select all target associated

chemicals as positive link Randomly select equal number of

chemicals that are not associated with the target

Compare with Naïve bayes using Molecular Weight, ALogP, number of hydrogen bond acceptors and donors, the number of rotatable bonds and FCFP_6 as descriptors

Leave one out validation

Two objects are related if they are related to same objects

Coauthorship

Same Target

Two objects are related if their related objects are related

Similar Drugs have distinct indications

Levodopa: dopaminergic agentAnti-parkinson drug

Methyldopa : antiadrenergicAntihypertensive drug

Slap similarity: p value>0.05Tanimoto coefficient=0.89

Direct: drug target interacts with each other physicallyIndirect: indirect interaction (e.g., change gene expression)Random: random drug target pairs

Association Score distribution among different pairs

top related