Transcript
Page 1: Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels

[1]

Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels

Chris Southan, Curator for IUPHAR-db/Guide to PHARMACOLOGY

Presented at the IUPHAR Meeting, Paris, October 2013

The presentation is based on this paper:

http://onlinelibrary.wiley.com/doi/10.1002/minf.201300103/abstract

[2]

[3]

So why compare databases?

• Determine what they actually contain
• Determine extraction selectivity
• Retro-divine the curatorial rules
• Assess relationship mapping stringency and fidelity
• Understand entity, attribute and relationship distributions
• Become aware of declared or cryptic circularity
• Detect global error propagation
• Evaluate unique content
• Judge utility, complementarity, consumability and integratability
• See what to avoid in your own database
• Know what to emulate

[4]

Introducing the resources (I)

• ~2/3 of ChEMBL is curated from medicinal chemistry papers, mainly as structure-activity-relationship (SAR) results. The other ~1/3 is from confirmatory PubChem BioAssays. Release 15 (January 2013) contains 9,570 targets, 1,254,575 distinct compounds, 10,509,572 activities and 48,735 publications.

• DrugBank collates target and mechanism-of-action information. Version 3.0 (January 2011) contains 6,715 drug entries including 1,452 FDA-approved small molecules, 131 biologicals, 86 nutraceuticals and 5,076 experimental compounds. These are mapped to 4,233 protein IDs.

• TTD is conceptually similar to DrugBank but the compound-to-target mappings are focussed on primary targets. It includes a three-way split of targets and compounds into marketed, clinical trial and research phase. The latest version, 4.3.02 (August 2011), includes 2,025 targets and 17,816 chemical structures, including 1,540 approved drugs.

[5]

Introducing the resources (II)

• HMDB collates detailed chemical, clinical and biochemical data on human metabolites. These are linked to other databases including enzymes involved in the transformations. Version 3.0 (September 2012) contains 40,437 chemical entries and 5,650 protein sequence identifiers.

• IUPHARdb/GTP (you are hearing about it this weekend...) contains 6,064 ligands, 559 approved drugs, 894 unique (UniProt) targets with direct activity mappings (all ligand types, all species, but excluding kinase screens) and 21,774 references.

[6]

Chemistry comparisons

[7]

Chemistry: Mw over time

[8]

Resolving content inside PubChem (slice ‘n dice)

Panels: ChEMBL, DrugBank, HMDB, TTD

[9]

Comparative protein attributes (e.g. GO)

Panels: IUPHARdb, HMDB

[10]

Comparing UniProt cross-references

[11]

Compare individual curatorial errors

[12]

Protein ID comparisons

Consensi are corroborative (but beware of curatorial circularity)

Human Swiss-Prot IDs
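
A minimal sketch of how such an ID comparison could be computed, assuming each database's protein identifiers have already been exported as a plain set of accessions. The function name `overlap_report` and the toy identifiers are hypothetical and for illustration only; they are not real database content and this is not necessarily how the slide's figures were produced.

```python
def overlap_report(id_sets):
    """Group every ID by the exact combination of sources that contain it.

    id_sets maps a source name to a set of identifiers. The result maps a
    tuple of source names (sorted alphabetically) to the IDs found in
    exactly those sources and no others, i.e. one Venn-diagram region.
    """
    names = sorted(id_sets)
    all_ids = set().union(*id_sets.values())
    regions = {}
    for identifier in all_ids:
        members = tuple(n for n in names if identifier in id_sets[n])
        regions.setdefault(members, set()).add(identifier)
    return regions


# Toy identifier sets, invented for illustration (not real accessions).
toy = {
    "ChEMBL": {"ID_A", "ID_B", "ID_C"},
    "DrugBank": {"ID_B", "ID_C", "ID_D"},
    "HMDB": {"ID_C", "ID_E"},
    "TTD": {"ID_B", "ID_C"},
}

regions = overlap_report(toy)

# Print regions from the broadest consensus down to source-unique content.
for members, ids in sorted(regions.items(), key=lambda kv: (-len(kv[0]), kv[0])):
    print(" & ".join(members), len(ids))
```

The region shared by all sources is the consensus, and the single-source regions are the unique content. A large consensus is corroborative only if the sources curated independently, which the counts alone cannot show.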

[13]

Differences in curation rules (e.g. atorvastatin)

[14]

Check for false-negatives

[15]

Thanks, questions welcome

Our 2012 paper (but the data was from 2010): Muresan S, Sitzmann M, Southan C. Mapping between databases of compounds and protein targets. Methods Mol Biol. 2012;910. PMID: 22821596

Now on http://figshare.com/articles/Mapping_Between_Databases_of_Compounds_and_Protein_Targets/818979

If you enjoyed this presentation, you might also like PMID: 20298516

