ACS San Francisco 2010 CINF Talk


Upload: markus-sitzmann

Post on 17-May-2015


Page 1: ACS San Francisco 2010 CINF Talk

NCI/CADD: Open-access chemical structure web platform

Markus Sitzmann [1], Wolf-Dietrich Ihlenfeldt [2], and Marc C. Nicklaus [1]

[1] Computer-Aided Drug Design Group, Chemical Biology Laboratory, NCI-Frederick, NIH, DHHS
[2] Xemistry GmbH, Auf den Stieden 8, D-35094 Lahntal, Germany

Page 2: ACS San Francisco 2010 CINF Talk

NCI/CADD Public Web Services

Enhanced NCI Database Browser
http://cactus.nci.nih.gov/ncidb2

web service for NCI/DTP’s Open NCI Database

• first release 1998, updated 2001
• ~250,000 structure records
• ~60 million data points

Chemical Structure Lookup Service
http://cactus.nci.nih.gov/lookup

structure lookup in over 100 databases

• first release 2006, updated 2008
• ~74 million structure records (~46 million unique structures)

Page 3: ACS San Francisco 2010 CINF Talk

NCI/CADD Public Web Services

OSRA
http://cactus.nci.nih.gov/osra/

converts graphical representations of chemical structures in journal articles, patent documents, textbooks, trade magazines etc. into SMILES

Online SMILES Translator
http://cactus.nci.nih.gov/translate/

GIF Creator for Chemical Structures
http://cactus.nci.nih.gov/gifcreator/

PROSIT: Online Pseudorotation Tool Version 2
http://cactus.nci.nih.gov/prosit/

Page 4: ACS San Francisco 2010 CINF Talk


http://cactus.nci.nih.gov

Page 5: ACS San Francisco 2010 CINF Talk


New Web Services

Page 6: ACS San Francisco 2010 CINF Talk

Chemical Structure Representations

[Diagram: a central "chemical structure" node surrounded by the identifiers and representations listed below]

• NCI/CADD Identifiers
• InChI/InChIKey
• ChemSpider ID
• PubChem SID/CID
• chemical names
• CAS Registry Number
• NSC number
• FDA UNII
• ChemNavigator SID
• SMILES
• SD File
• Chemical Formula
• ChEBI ID
• PDB Ligand ID
• MRV
• CML
• SYBYL Line Notation
• GIF image

Page 7: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
NCI/CADD Web Resources

http://cactus.nci.nih.gov/chemical/structure

Works as a resolver for different chemical structure identifiers. Allows one to convert a given structure identifier into another representation or structure identifier.

Page 8: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
NCI/CADD Web Resources

http://cactus.nci.nih.gov/chemical/structure

• first beta release: July 2009
• second beta release: Nov. 2009
• third beta release: April/May 2010
(beta versions will continue through 2010)

3.0 million requests since July 1, 2009 (~11,000/day)

Page 9: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
NCI/CADD Web Resources

• it is usable by a simple URL API:

http://cactus.nci.nih.gov/chemical/structure/"identifier"/"representation"

example: http://cactus.nci.nih.gov/chemical/structure/Tamiflu/cas

204255-11-8

MIME type: text/plain

XML format: http://cactus.nci.nih.gov/chemical/structure/"identifier"/"representation"/xml

• if a request is not resolvable: HTTP 404 status message
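The URL scheme on this slide can be called from a script. The following is a minimal Python sketch and not part of the talk: the helper names `build_resolver_url` and `resolve` are hypothetical, and percent-encoding the identifier is an assumption about how identifiers with special characters should be passed in the path.

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def build_resolver_url(identifier: str, representation: str, xml: bool = False) -> str:
    """Assemble <base>/<identifier>/<representation>, optionally with /xml appended."""
    # Assumption: the identifier is percent-encoded so that characters such as
    # '#' or spaces survive as part of the URL path.
    url = f"{BASE_URL}/{quote(identifier, safe='')}/{representation}"
    return f"{url}/xml" if xml else url


def resolve(identifier: str, representation: str) -> Optional[str]:
    """Return the text/plain answer, or None when the service answers HTTP 404."""
    try:
        with urlopen(build_resolver_url(identifier, representation)) as response:
            return response.read().decode("utf-8").strip()
    except HTTPError as err:
        if err.code == 404:  # the slide: not resolvable -> HTTP 404
            return None
        raise
```

Calling `resolve("Tamiflu", "cas")` would issue the example request from the slide; the function needs network access and is therefore not called in the sketch itself.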

Page 10: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
NCI/CADD Web Resources

[Flowchart: how a request is resolved]

http request with "identifier" (e.g. CAS number, chemical name) and "representation" (e.g. InChI, GIF image)
  → detection of the identifier type
      • identifier is a full structure representation (e.g. SMILES, InChI): calculation of the requested structure representation
      • identifier is a hashed structure representation (e.g. InChIKey), chemical name etc.: database lookup of the structure
  → http response (with MIME type)
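The branching in this flowchart can be illustrated with a small Python sketch. It is not the resolver's actual detection code: the function names, the regular expressions, and the three route labels are assumptions made for illustration. Strings that are neither an InChI, an InChIKey, nor a CAS number are left undetermined here, because a string can be both a valid SMILES and a chemical name.

```python
import re

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, joined by hyphens.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_cas_number(text: str) -> bool:
    """Check the CAS Registry Number pattern and its check digit."""
    match = CAS_RE.match(text)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # The rightmost digit gets weight 1, the next one weight 2, and so on.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def classify(identifier: str) -> str:
    """Guess the identifier type (hypothetical labels)."""
    if identifier.startswith("InChI="):
        return "InChI"
    if INCHIKEY_RE.match(identifier):
        return "InChIKey"
    if is_cas_number(identifier):
        return "CAS number"
    return "other"


def route(identifier: str) -> str:
    """Map the identifier type onto the two branches of the flowchart."""
    kind = classify(identifier)
    if kind == "InChI":
        return "calculation"      # full structure representation
    if kind in ("InChIKey", "CAS number"):
        return "database lookup"  # hashed representation or registry number
    return "undetermined"         # could be a SMILES string or a chemical name
```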

Page 11: ACS San Francisco 2010 CINF Talk

“Chemical Structure Web Engine”

[Architecture diagram with the following components:]

• NCI/CADD web services, including the Chemical Identifier Resolver (accessed via http)
• Chemical Structure Web Engine
• CACTVS
• NCI/CADD Chemical Structure Database (CSDB)
• external web services
• other software packages

Page 12: ACS San Francisco 2010 CINF Talk

Chemical Structure Database
NCI/CADD Web Resources

• number of structure records: 103.9 million
• number of unique structures:
    Std. InChIKey: ~73.0 million
    FICuS: ~70.6 million
    uuuuu: ~65.3 million
    union set of unique structures: ~83.6 million
• from the set of ~83.6 million unique structures we have derived about 10 million additional scaffold-type structures (for future structure searches); thus:
• for lookup “identifier → structure” available:
    ~92.9 million Standard InChIKeys
    ~93.3 million NCI/CADD Identifiers
    ~70 million chemical names linked to ~16 million structures

Page 13: ACS San Francisco 2010 CINF Talk

Chemical Structure Database
NCI/CADD Web Resources

• ChemNavigator iResearch Library
  compilation of commercially available screening compounds from ~300 international chemistry suppliers
• PubChem database
  including Open NCI database, EPA DSSTox databases, NIAID HIV databases, NIST Webbook, NLM ChemIDplus, ChemSpider …
• Commercial Sources / others
  Asinex, Comgenex, …

as of March 2010:
140 chemical structure databases
103.9 million structure records
~70.6 million unique structures by FICuS

[Pie chart: ChemNav. iResearch Lib. ~56%, PubChem ~38%, others ~6%]

Page 14: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifiers
Unique Representation of Chemical Structures

• based on hashcodes calculated by the chemoinformatics toolkit CACTVS
• CACTVS hashcodes:
    represent a chemical structure uniquely as a 16-digit hexadecimal number (64-bit unsigned)
    have a high sensitivity to structural features of a compound
    change if connectivity changes

[Figure: example structure with its hashcode 9850FD9F9E2B4E25]
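Since a hashcode is described as a 16-digit hexadecimal number holding a 64-bit unsigned value, it can be stored either as text or as an integer. The short Python sketch below converts between the two forms; the function names are hypothetical, and nothing here computes a CACTVS hashcode from a structure.

```python
import string


def hashcode_to_int(hashcode: str) -> int:
    """Convert a 16-digit hexadecimal hashcode into an unsigned 64-bit integer."""
    if len(hashcode) != 16 or any(c not in string.hexdigits for c in hashcode):
        raise ValueError(f"not a 16-digit hexadecimal hashcode: {hashcode!r}")
    return int(hashcode, 16)


def int_to_hashcode(value: int) -> str:
    """Format an unsigned 64-bit integer as 16 upper-case hexadecimal digits."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit into 64 unsigned bits")
    return f"{value:016X}"
```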

Page 15: ACS San Francisco 2010 CINF Talk

[Figure: the example structure (hashcode 9850FD9F9E2B4E25) shown together with variants of the same compound, labeled charged form, tautomers, isotope, salt, stereoisomers, and “errors”. Hashcodes shown for the variants: A3DAE0788050DDE4, 3ECEF579D7DF025A, E92E4BA2869F3611, 8A7AD1EB498CC76A, 6C16DE2351F9FF50, 8F7A1DE5A733F0E0, 60525E1AF41497B6, B2FDA68AEDA06DB9]

Page 16: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifiers
Unique Representation of Chemical Structures

[Workflow diagram]

input structure (MDL Molfile, MDL SDF, SMILES, ChemDraw cdx, PDB)
  → structure normalization
  → parent structure (MDL SDF, SMILES, database)
  → hashcode calculation (E_HASHISY)
  → NCI/CADD Identifier

Page 17: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifiers
Structure Normalization

• adjustable levels of sensitivity: each feature is either treated as sensitive or normalized as follows when un-sensitive:

Fragments: keep only largest organic fragment
Isotopes: ignore isotope labels
Charges: uncharge
Tautomers: find canonical tautomer
Stereochemistry: discard stereo information

[Figure: example structures illustrating the sensitive and un-sensitive treatment of each feature]

Page 18: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifiers
Structure Normalization

[Figure: example structures for the features Fragments, Isotopes, Charges, Tautomers, and Stereochemistry, each shown in its sensitive and un-sensitive form]

Page 19: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifiers
Structure Normalization

FICTS identifier: representation of the exact drawing

Fragments (F): sensitive
Isotopes (I): sensitive
Charges (C): sensitive
Tautomers (T): sensitive
Stereochemistry (S): sensitive

[Figure: the example structure pairs for each feature, marked as equal (=) or different (≠) under this identifier]

Page 20: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifiers
Structure Normalization

FICuS identifier: comes closest to how a chemist perceives a compound

Fragments (F): sensitive
Isotopes (I): sensitive
Charges (C): sensitive
Tautomers (u): un-sensitive
Stereochemistry (S): sensitive

[Figure: the example structure pairs for each feature, marked as equal (=) or different (≠) under this identifier]

Page 21: ACS San Francisco 2010 CINF Talk

NCI/CADD Structure Identifier
Structure Normalization

uuuuu identifier: closely related forms of the same compound

Fragments (u): un-sensitive
Isotopes (u): un-sensitive
Charges (u): un-sensitive
Tautomers (u): un-sensitive
Stereochemistry (u): un-sensitive

[Figure: the example structure pairs for each feature, all marked as equal (=) under this identifier]
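The names FICTS, FICuS, and uuuuu encode the sensitivity level position by position: an upper-case feature letter means sensitive, a lower-case "u" means un-sensitive. The Python sketch below decodes such a name into per-feature flags. The function name `sensitivity` is hypothetical, and the sketch only interprets the name; it does not perform any structure normalization.

```python
from typing import Dict

# Feature order as given by the letters F, I, C, T, S.
FEATURES = ("Fragments", "Isotopes", "Charges", "Tautomers", "Stereochemistry")


def sensitivity(identifier_type: str) -> Dict[str, bool]:
    """Decode e.g. 'FICuS' into {feature: True if sensitive, False if un-sensitive}."""
    if len(identifier_type) != len(FEATURES):
        raise ValueError(f"expected {len(FEATURES)} letters: {identifier_type!r}")
    flags: Dict[str, bool] = {}
    for feature, letter in zip(FEATURES, identifier_type):
        if letter == "u":
            flags[feature] = False   # feature is normalized away
        elif letter == feature[0]:
            flags[feature] = True    # feature is kept
        else:
            raise ValueError(f"unexpected letter {letter!r} for {feature}")
    return flags
```

For example, `sensitivity("FICuS")` marks only the tautomer feature as un-sensitive.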

Page 22: ACS San Francisco 2010 CINF Talk

FICTS

[Figure: the example structure and its variants (charged form, tautomers, isotope, salt, stereoisomers, “errors”), each labeled with its FICTS identifier. Labels on the slide:]

9850FD9F9E2B4E25-FICTS (two structures)
E5F83F10C5DB080A-FICTS (two structures)
A3DAE0788050DDE4-FICTS
B2FDA68AEDA06DB9-FICTS
E92E4BA2869F3611-FICTS
8A7AD1EB498CC76A-FICTS
6C16DE2351F9FF50-FICTS

Page 23: ACS San Francisco 2010 CINF Talk

FICuS

[Figure: the example structure and its variants (charged form, tautomers, isotope, salt, stereoisomers, “errors”), each labeled with its FICuS identifier. Labels on the slide:]

9850FD9F9E2B4E25-FICuS (three structures)
E5F83F10C5DB080A-FICuS (two structures)
A3DAE0788050DDE4-FICuS
B2FDA68AEDA06DB9-FICuS
E92E4BA2869F3611-FICuS
8A7AD1EB498CC76A-FICuS

Page 24: ACS San Francisco 2010 CINF Talk

uuuuu

[Figure: the example structure and its variants (charged form, tautomers, isotope, stereoisomers, salt, “errors”), each labeled with its uuuuu identifier. Labels on the slide:]

9850FD9F9E2B4E25-uuuuu (eight structures)
9850FD9F9E2B4E25-FICuS (one structure)
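The effect of lowering the sensitivity can be checked by counting distinct hashcodes among the nine labels on each of the three preceding slides. The Python sketch below does that with the labels copied from the slides; the helper names are hypothetical. One label on the uuuuu slide carries a -FICuS suffix, so only the hashcode part is compared.

```python
from typing import Dict, List, Tuple


def split_identifier(label: str) -> Tuple[str, str]:
    """Split 'HASHCODE-TYPE' into its hashcode and identifier type."""
    hashcode, _, kind = label.partition("-")
    return hashcode, kind


# Labels as they appear on the FICTS, FICuS, and uuuuu slides.
SLIDES: Dict[str, List[str]] = {
    "FICTS": [
        "A3DAE0788050DDE4-FICTS", "E5F83F10C5DB080A-FICTS", "B2FDA68AEDA06DB9-FICTS",
        "9850FD9F9E2B4E25-FICTS", "E5F83F10C5DB080A-FICTS", "E92E4BA2869F3611-FICTS",
        "8A7AD1EB498CC76A-FICTS", "6C16DE2351F9FF50-FICTS", "9850FD9F9E2B4E25-FICTS",
    ],
    "FICuS": [
        "A3DAE0788050DDE4-FICuS", "E5F83F10C5DB080A-FICuS", "B2FDA68AEDA06DB9-FICuS",
        "9850FD9F9E2B4E25-FICuS", "E5F83F10C5DB080A-FICuS", "E92E4BA2869F3611-FICuS",
        "8A7AD1EB498CC76A-FICuS", "9850FD9F9E2B4E25-FICuS", "9850FD9F9E2B4E25-FICuS",
    ],
    "uuuuu": ["9850FD9F9E2B4E25-uuuuu"] * 8 + ["9850FD9F9E2B4E25-FICuS"],
}


def distinct_hashcodes(labels: List[str]) -> int:
    """Count how many different hashcodes occur among the labels."""
    return len({split_identifier(label)[0] for label in labels})


COUNTS = {name: distinct_hashcodes(labels) for name, labels in SLIDES.items()}
print(COUNTS)  # {'FICTS': 7, 'FICuS': 6, 'uuuuu': 1}
```

The nine drawings collapse from seven distinct FICTS hashcodes to six under FICuS and to a single one under uuuuu.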

Page 25: ACS San Francisco 2010 CINF Talk

NCI/CADD Chemical Structure Database

[Schema diagram]

NCI/CADD:RID structure records: 103.5 million
NCI/CADD:CID compounds (structures unique by CACTVS HASHISY): 83.6 million

FICTS associations: ~72.0 million
FICuS associations: ~70.6 million
uuuuu associations: ~65.3 million

~130 million linkouts to original database records

linked to:
• StdInChI[Key]
• chemical names
• chemical formula
• properties
• etc.

Page 26: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
NCI/CADD Public Web Resources

http://cactus.nci.nih.gov/chemical/structure

“identifier” (input to the resolver):
• chemical names
• CAS numbers
• SMILES strings
• IUPAC InChI/InChIKeys
• NCI/CADD Identifiers
• CACTVS HASHISY
• NSC number
• PubChem SID/CID
• FDA UNII
• ChemSpider ID
• ChemNavigator SID
• Chemical Formula

“representation” (output of the resolver):
• /smiles
• /names, /iupac_name
• /cas
• /inchi, /stdinchi
• /inchikey, /stdinchikey
• /ficts, /ficus, /uuuuu
• /image
• /file, /sdf
• /mw, /monoisotopic_mass
• /formula
• /twirl, /3d
• /urls
• /unii
• /chemspider_id
• /pubchem_sid
• /chemnavigator_sid

Page 27: ACS San Francisco 2010 CINF Talk

Standard InChIKey
Chemical Identifier Resolver

• can resolve ~93.0 million Standard InChIKeys into a full structure representation:

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA-N/smiles

CCO

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA/smiles

CCO
CC[OH2+]

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ/smiles

C(C(O)([2H])[2H])[2H]
CC(O)([2H])[2H]
C(CO)([2H])([2H])[2H]
CC[17OH]
C(CO)[2H]
[14CH3]CO
CCO
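The three requests on this slide differ only in how much of the Standard InChIKey is supplied: the full key, the key without its final protonation character, and the first (connectivity) block alone. The Python sketch below derives the three query strings from a full key; the function name is hypothetical, and which structures each truncated form returns depends on the resolver's database.

```python
from typing import List


def inchikey_queries(inchikey: str) -> List[str]:
    """Return the full key, the key without protonation flag, and the first block."""
    parts = inchikey.split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a full InChIKey: {inchikey!r}")
    first_block, second_block, _protonation = parts
    return [
        inchikey,                          # exact match
        f"{first_block}-{second_block}",   # any protonation state
        first_block,                       # same connectivity block
    ]


for query in inchikey_queries("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"):
    print(f"http://cactus.nci.nih.gov/chemical/structure/{query}/smiles")
```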

Page 28: ACS San Francisco 2010 CINF Talk

File Representation
Chemical Identifier Resolver

http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA-N/file?format=sdf

• available formats:

alc          Alchemy format
cdxml        CambridgeSoft ChemDraw XML format
cerius       MSI Cerius II format
charmm       Chemistry at HARvard Macromolecular Mechanics file format
cif          Crystallographic Information File
cml          Chemical Markup Language
ctx          Gasteiger Clear Text format
gjf          Gaussian input data file
gromacs      GROMACS file format
hyperchem    HyperChem file format
jme          Java Molecule Editor format
maestro      Schroedinger MacroModel structure file format
mol          Symyx molecule file
sybyl2/mol2  Tripos Sybyl MOL2 format
mrv          ChemAxon MRV format
pdb          Protein Data Bank
sdf          Symyx Structure Data Format
sdf3000      Symyx Structure Data Format 3000
sln          SYBYL Line Notation
smiles       SMILES
xyz          xyz file format

Page 29: ACS San Francisco 2010 CINF Talk

Structure Image Generation
Chemical Identifier Resolver

http://cactus.nci.nih.gov/chemical/structure/buckyball/image?height=300&width=300&bgcolor=black&bondcolor=white

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?height=200&width=200&symbolfontsize=7&footer="Aspirin"

[Images: the rendered buckyball structure and the aspirin structure with the footer “Aspirin”]
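Representations such as /image and /file take their options as URL query parameters. The Python sketch below assembles such URLs with the standard library; the function name `representation_url` is hypothetical, and the only parameter names used are the ones that appear on the slides (height, width, bgcolor, bondcolor, format).

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def representation_url(identifier: str, representation: str, **options: object) -> str:
    """Build <base>/<identifier>/<representation>?<options> (options are optional)."""
    url = f"{BASE_URL}/{quote(identifier, safe='')}/{representation}"
    # urlencode keeps the keyword order and percent-encodes the values.
    return f"{url}?{urlencode(options)}" if options else url


print(representation_url("buckyball", "image", height=300, width=300,
                         bgcolor="black", bondcolor="white"))
print(representation_url("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "file", format="sdf"))
```

Values containing quotation marks, such as the footer text in the aspirin example, are percent-encoded by `urlencode`.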

Page 30: ACS San Francisco 2010 CINF Talk

TwirlyMol
Chemical Identifier Resolver

implemented by Noel O'Boyle (University College Cork, Ireland)

simple JavaScript that allows you to render a rotatable/zoomable 3D representation of a molecule in your web browser

no plugin is needed, only a modern browser:

[Browser compatibility chart: Chrome, Safari, FF3.5/3.6, FF3.0, FF2.0, IE8, IE7, IE6]

Page 31: ACS San Francisco 2010 CINF Talk

TwirlyMol
Chemical Identifier Resolver

• simple viewer:
http://cactus.nci.nih.gov/chemical/structure/restasis/twirl

• embed into a web page:

<div id="canvas" height="400" width="400"></div>
<script src="http://cactus.nci.nih.gov/chemical/structure/restasis/twirl_cached/canvas"></script>

Page 32: ACS San Francisco 2010 CINF Talk


restasis

Page 33: ACS San Francisco 2010 CINF Talk

TwirlyMol
Chemical Identifier Resolver

http://www.coronene.com/blog/

http://chemical-quantum-images.blogspot.com

http://baoilleach.blogspot.com/

Page 34: ACS San Francisco 2010 CINF Talk

Ambiguous Identifiers
Chemical Identifier Resolver

• e.g. the string “CCO” can be resolved as
    SMILES string of “ethanol”
    abbreviation for “Carboxymethylthio-3-(3-Chlorphenyl)-1,2,4-Oxadiazol)”

name a specific resolver module:

http://cactus.nci.nih.gov/chemical/structure/CCO/iupac_name?resolver=smiles

ethanol

http://cactus.nci.nih.gov/chemical/structure/CCO/iupac_name?resolver=name

2-[[3-(3-chlorophenyl)-1,2,4-oxadiazol-5-yl]sulfanyl]acetic acid

Page 35: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
Ambiguous Identifiers

• e.g. the string “CCO” can be resolved as
    SMILES string of “ethanol”
    abbreviation for “Carboxymethylthio-3-(3-Chlorphenyl)-1,2,4-Oxadiazol)”

XML format:

http://cactus.nci.nih.gov/chemical/structure/CCO/iupac_name/xml

<?xml version="1.0" encoding="UTF-8" ?>
<request string="CCO" representation="iupac_name">
  <data id="1" resolver="smiles" string_class="SMILES String">
    <item id="1">ethanol</item>
  </data>
  <data id="2" resolver="name" string_class="Chemical Name">
    <item id="1">2-[[3-(3-chlorophenyl)-1,2,4-oxadiazol-5-yl]sulfanyl]acetic acid</item>
  </data>
</request>
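The XML answer for the ambiguous string “CCO” groups the results by resolver module, which makes it easy to read programmatically. The Python sketch below parses the response shown on this slide with the standard library; the sample is copied from the slide (a live response may differ), and the function name `interpretations` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Response copied from the slide; kept as bytes because it declares its encoding.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<request string="CCO" representation="iupac_name">
  <data id="1" resolver="smiles" string_class="SMILES String">
    <item id="1">ethanol</item>
  </data>
  <data id="2" resolver="name" string_class="Chemical Name">
    <item id="1">2-[[3-(3-chlorophenyl)-1,2,4-oxadiazol-5-yl]sulfanyl]acetic acid</item>
  </data>
</request>"""


def interpretations(xml_bytes: bytes) -> List[Tuple[str, str, List[str]]]:
    """Return (resolver, string_class, [item texts]) for every <data> element."""
    root = ET.fromstring(xml_bytes)
    results = []
    for data in root.findall("data"):
        items = [(item.text or "").strip() for item in data.findall("item")]
        results.append((data.get("resolver", ""), data.get("string_class", ""), items))
    return results


for resolver, string_class, items in interpretations(SAMPLE):
    print(f"{resolver} ({string_class}): {items}")
```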

Page 36: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
Database URL Lookup

• get the URL of the original structure records:

http://cactus.nci.nih.gov/chemical/structure/restasis/urls/xml

<?xml version="1.0" encoding="UTF-8" ?>
<request string="restasis" representation="urls">
  <data id="1" resolver="name" string_class="Chemical Name">
    <item id="1" classification="exact" database="ChemSpider" publisher="ChemSpider">http://chemspider.com/structure.4939506</item>
    <item id="2" classification="exact" database="ChemSpider" publisher="PubChem">http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?sid=43028058</item>
    <item id="3" classification="exact" database="NLM ChemIDplus" publisher="NLM">http://chem.sis.nlm.nih.gov/chemidplus/direct.jsp?result=advanced&regno=059865133</item>
    […]
  </data>
</request>

Page 37: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
Name Lookup

• get available names:

http://cactus.nci.nih.gov/chemical/structure/CC(=O)Oc1ccccc1C(O)=O/names/xml

<?xml version="1.0" encoding="UTF-8" ?>
<request string="CC(=O)Oc1ccccc1C(O)=O" representation="names">
  <data id="1" resolver="smiles" string_class="SMILES String" description="CC(=O)Oc1ccccc1C(O)=O">
    <item id="1" classification="PUBCHEM_IUPAC_NAME">2-acetyloxybenzoic acid</item>
    <item id="2" classification="PUBCHEM_IUPAC_OPENEYE_NAME">2-Acetoxybenzoic acid</item>
    <item id="3" classification="PUBCHEM_GENERIC_REGISTRY_NAME">50-78-2</item>
    <item id="4" classification="PUBCHEM_GENERIC_REGISTRY_NAME">11126-35-5</item>
    <item id="5" classification="PUBCHEM_GENERIC_REGISTRY_NAME">11126-37-7</item>
    <item id="6" classification="PUBCHEM_GENERIC_REGISTRY_NAME">2349-94-2</item>
    <item id="7" classification="PUBCHEM_GENERIC_REGISTRY_NAME">26914-13-6</item>
    <item id="8" classification="PUBCHEM_SUBSTANCE_SYNONYM">NCGC00090977-04</item>
    <item id="9" classification="PUBCHEM_SUBSTANCE_SYNONYM">KBioSS_002272</item>
    <item id="10" classification="PUBCHEM_SUBSTANCE_SYNONYM">SBB015069</item>
    <item id="11" classification="PUBCHEM_SUBSTANCE_SYNONYM">Aspirin</item>
    <item id="12" classification="PUBCHEM_SUBSTANCE_SYNONYM">D00109</item>
    […]

Page 38: ACS San Francisco 2010 CINF Talk


http://cactus.nci.nih.gov/blog

/chemical/structure Blog

Page 39: ACS San Francisco 2010 CINF Talk


In Development

http://cactus.nci.nih.gov/TEST_chemical/structure

Page 40: ACS San Francisco 2010 CINF Talk

“Chemical Operators”
Chemical Identifier Resolver

http://cactus.nci.nih.gov/chemical/structure/operator:identifier/representation

• manipulates the structure created from the identifier
• new representation is calculated after structure manipulation

operators: tautomers, canonical_tautomer, addh, removeh, nostereo, rings, …

Page 41: ACS San Francisco 2010 CINF Talk

Tautomers
“Chemical Operator”

http://cactus.nci.nih.gov/chemical/structure/tautomers:guanine/"representation"

[Figure: tautomeric forms of guanine generated by the tautomers operator]

Page 42: ACS San Francisco 2010 CINF Talk

IUPAC InChI/InChIKey Resolver

• (hopefully) there will be many resolvers from different providers with different backgrounds:
    publishers
    commercial databases
    free sources and databases: ChemSpider, PubChem, ChEBI, …
• Std. InChI[Key] is the perfect tool to interlink the resolvers
• ChemSpider and NCI/CADD are working on a test protocol for a federated InChI/InChIKey resolver

Page 43: ACS San Francisco 2010 CINF Talk

IUPAC InChI/InChIKey Resolver

[Diagram: hierarchy of federated resolvers, with Clients and the Chemical Identifier Resolver as further labels]

IUPAC Root Resolver
    Resolver 1
    Resolver 2
    Resolver 3
        Resolver 3.1
        Resolver 3.2
        Resolver 3.3

Page 44: ACS San Francisco 2010 CINF Talk

Chemical Identifier Resolver
NCI/CADD Web Resources

http://cactus.nci.nih.gov/chemical/structure

http://cactus.nci.nih.gov/blog

Page 45: ACS San Francisco 2010 CINF Talk


Acknowledgments

ChemNavigator

Scott Hutton

Tad Hurst

CADD Group, CBL, NCI

Igor Filippov

Noel O'Boyle

Hans-Juergen Himmler (Akos)

Thanks to all database providers!

Our web site:

http://cactus.nci.nih.gov

Page 46: ACS San Francisco 2010 CINF Talk

Users

webel.py - A Cinfony module
http://baoilleach.blogspot.com/2009/11/introducing-webel-cheminformatics.html

IUPHAR DATABASE
http://www.iuphar-db.org

http://www.akosgmbh.eu/globalsearch/index.htm

avogadro.openmolecules.net/

CACTVS
http://www.xemistry.com

in silico toxicology
http://www.in-silico.ch/

Symyx Draw Resolver
http://www.symyx.com/