solutions for cheminformatics

49
Solutions for Cheminformatics 21-25 Nov, 2010, Hyderabad Library Compound Design Methods for Custom Library Synthesis

Upload: hyatt-thomas

Post on 31-Dec-2015

42 views

Category:

Documents


2 download

DESCRIPTION

Library Compound Design Methods f or Custom Library Synthesis. Solutions for Cheminformatics. 21-25 Nov, 2010 , Hyderabad. Offers. Usage. Library design by ChemAxon. DB. DB. Fragmentation R-group decomposition Fragmentation Reagent clipping. Enumeration Fuse fragments - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Solutions for Cheminformatics

Solutions for Cheminformatics

21-25 Nov, 2010, Hyderabad

Library Compound Design Methods for Custom

Library Synthesis

Page 2: Solutions for Cheminformatics

Offers

Page 3: Solutions for Cheminformatics

Usage

Page 4: Solutions for Cheminformatics

Library design by ChemAxon

DB

DB

Databases

ReactionsMoleculesMarkush structuresQueries

Compound selection

Similarity searchesSubstructure searches

Enumeration

Fuse fragmentsR-group compositionReaction enumerationMarkush enumeration

Library analysis

Clustering2D similarity screen3D Shape similarity screen

Fragmentation

R-group decompositionFragmentationReagent clipping

Page 5: Solutions for Cheminformatics

Library design by ChemAxonChemAxon Technology

Chemical data storage JChemBaseJChem Cartridge for Oracle

Chemical data search JChem search technology

Chemical data visualization JChem for Excel – Marvin Instant Jchem - Marvin

Chemical data characterization Calculator plugins – logP, pKa ...

Enumeration Reactor – reaction enumerationMarkush enumerationR-group compositionFragment fusion

Fragmentation FragmenterR-group decomposition

Analysis JKlustorScreen 3D - Screen 2D

Page 6: Solutions for Cheminformatics

Databases: displaying content on your desktop

JChem for Excel

Instant JChem

Page 7: Solutions for Cheminformatics

Databases: displaying content: JSP application

• Search technology

• Descriptors

• Alignments

• Chemical Terms filter

• Import / Export /Edit

• AJAX in JChem Webservices

ONLINE TRYOUThttps://www.chemaxon.com

Page 8: Solutions for Cheminformatics

Building blocks for library enumeration

Instant JChem

- Fragmentation

JChem for Excel

- R-group decomposition

Command line

- Fragmentation

- R-group decomposition

Page 9: Solutions for Cheminformatics

0.47 0.55

0.57

0.28

0.20

0.06

Which fragments?

regular Tanimoto

optimized Tanimoto

Optimization of Similarity search metrics: ECFP/FPCP/Chemical FP/ Pharmacophore FP

Page 10: Solutions for Cheminformatics

Similarity searching statistics

1

10

100

1000

10000

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Number of Active Hits

Num

ber

of H

its

Tanimto Euclidean Optimized Ideal

Page 11: Solutions for Cheminformatics

Enumeration

Output files ChemAxon technology

Fragments Fragment fusion

Markush structures Markush enumeration(search without enumeration)

Reactants – generic reactions

Reaction enumeartion

R-tables R-group composition

Page 12: Solutions for Cheminformatics

Enumeration

R-table Markush structure

Page 13: Solutions for Cheminformatics

ChemAxon in Knime

Page 14: Solutions for Cheminformatics

Reaction Enumeration

EXCLUDE: match(reactant(1), "[Cl,Br,I]C(=[O,S])C=C") ormatch(reactant(0), "[H][O,S]C=[O,S]") ormatch(reactant(0), "[P][H]") or(max(pka(reactant(0), filter(reactant(0), "match('[O,S;H1]')"), "acidic")) > 14.5) or(max(pka(reactant(0), filter(reactant(0),"match('[#7:1][H]', 1)"), "basic")) > 0)

Page 15: Solutions for Cheminformatics

ChemAxon in Knime

Page 16: Solutions for Cheminformatics

Library analysis

• Characterisation of library:– Fragments - Fragmenter– Molecular descriptors – Calculator plugins

Page 17: Solutions for Cheminformatics

Library analysis – 3D shape similarity search

Test on DUD

1% Enrichment

ADA CDK2 DHFR ER FXA HIVRT NA P38 thrombin TK trypsin0

5

10

15

20

25

30

35

40

Surflex-sim

ROCS

FlexS

ICMsim

CXN-H

Per

cen

t o

f th

e ac

tive

s fo

un

d

Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Page 18: Solutions for Cheminformatics

Library analysis

Wide range of methods• Unsupervised, agglomerative

clustering• Hierarchical and non-hierarchical

methods • Similarity based and structure

based techniques

Flexible search options • Tanimoto and Euclidean metrics,

weighting• Maximum common substructure

identification• chemical property matching

including atom type, bond type, hybridization, charge

Page 19: Solutions for Cheminformatics

Use cases

Page 20: Solutions for Cheminformatics

Use cases

Page 21: Solutions for Cheminformatics

Use cases

Page 22: Solutions for Cheminformatics

Use cases

Page 23: Solutions for Cheminformatics

Use cases

Page 24: Solutions for Cheminformatics

Use cases

Page 25: Solutions for Cheminformatics

Use cases

Page 26: Solutions for Cheminformatics

Use cases

Page 27: Solutions for Cheminformatics

Use cases

Page 28: Solutions for Cheminformatics

Target-focused libraries: rapid selection of potential PDE inhibitors from

multi-million compounds’ repositories

Why do we need rapid selection of target- focused libraries?

Design inputs

2D similarity searching strategy

Property-based filtering

Seed/ chemotype representation (diversity)

Conclusion/ Proposals

TargetEx Ltd., György Dormán

Page 29: Solutions for Cheminformatics

Target -focused libraries via Virtual Screening

H-bond Acceptor

AromaticH-bond donor

Cation

Docking Target structure

Source CompoundsCommercial Samples

Combinatorial Libraries/Historical collectionsDe Novo Compounds

Known Active Compounds

2DSubstructure-

Similarity SearchingPartitioningData FusionClustering

KernelsSVM

3D Pharmacophore

Shape Similarity3D/4D-QSAR

Final Visual InspectionAcquisition

Plating

FilteringADMET

Lead-likeness

Biological testing2D fingerprint

Focused library

TargetEx Ltd., György Dormán

Page 30: Solutions for Cheminformatics

2D similarity selection

April 19, 2023

Page 31: Solutions for Cheminformatics

Similarity searching strategy: execution• Setting the starting similarity level (dependent on the

fingerprint S/W, T= 60-75 % for ChemAxon)

• Iteration based on the results (scenarios):

• the number of virtual hits are between 50 and 500, OK

• the number of virtual hits are <50 or >500– if <50 lower the similarity treshold with 5 %– if >500 increase the similarity treshold with 5 %– This can be continued until the optimal range achieved– If 5 % decrease results in >500 compounds the search can

be refined by 2% (alternatively a diversity selection would be needed, but that is not available)

– Duplications can be removed when merged the resulting DBs

Page 32: Solutions for Cheminformatics

How to reduce the number of the hits?Normally screening companies would like to buy 100-1000

compounds

• Since from the various vendor DBs we can obtain 2000- 10.000 virtual hits their number can be reduced

• 1. Applying the reference property space (Lipinski and Veber rules) (IJchem OK)

• 2. There are overrepresented seeds thus virtual hits coming from those seeds can be reduced (IJchem OK)

• 3. Applying an optimal distribution of the resulting chemotypes (removing the overrepresented compounds) (Limited with Jklustor)

• 4. Simple diversity analysis (JKlustor)

Page 33: Solutions for Cheminformatics

TargetEx Ltd., György Dormán

1. Applying the reference property space: Structural determinants: H-bond donor/ acceptor, hydrophobic interactions

(property space determination)

Pharmacophore fingerprints requires more computation and time consumingIn simple similarity search pharmacophore features can only be considered as statistical features (not connected to structures)The similarity search results can be filtered based on the physico-chemical parameter space of the seed compounds (+10/-10 % range applied)

Page 34: Solutions for Cheminformatics

Results and further reduction

• Similarity search results: 8655

• After property filtering: 2009

• 2. There are overrepresented seeds thus virtual hits coming from those seeds can be reduced

• When combining the similarity search the contribution of the seeds can be controlled (or set the number of analogues derived from certain seeds)

Page 35: Solutions for Cheminformatics

2. Overrepresented seedsSeeds leading to highest number of similar hits

HN

NN

N

O

CH3

CH3

CH3

SN

NH

3C

O

O

O

#4 (Sildenafil)238 analogues(60 % similarity or above)

H3C

O

S

O

O

N

O

N

O

HN

CH3

#13328 analogues(60 % similarity or above)

H3C

O

S

O

O

N

O

N

O

HN

CH3

#18 (desantafil)4494 analogues(60- 80 % similarity)

Page 36: Solutions for Cheminformatics

2. Overrepresented seedsSeeds leading to highest number of similar hits

O

CH3

SO O

N

N

H3C

HN

NN

O

N

CH3

CH3

#27237 analogues(60 % similarity or above)

N

N

HN

O

CH3

O

CH3

NN

#28272 analogues(60 % similarity or above)

N

NCl

HN

N

OH

O

O

O

#30466 analogues(60 % similarity or above)

NH

N

O

O

N

O

O CH3

#442726 analogues(60 % similarity or above)

Page 37: Solutions for Cheminformatics

Recurring structural motifs in the seed structures

Page 38: Solutions for Cheminformatics

Recurring structural motifs in the similarity search results

Page 39: Solutions for Cheminformatics

3. Applying an optimal distribution of the resulting chemotypes Proposed application of JKlustor/LibMCS

• Taking into consideration of the substructure where the maximum number of connection (bond) is found– it can be an option– Maybe difficult to define

• Using such option the „real” core structure can be found easier

Page 40: Solutions for Cheminformatics

Use caseIan BerryEvotec

Page 41: Solutions for Cheminformatics

Evotec Library Profiler

• Aim is to be able to select from a large virtual library either:– A combinatorial subset

• Typically small focussed libraries– A non-combinatorial subset

• Medicinal chemistry projects

• Desirable to allow access to all scientists– Creativity– Share ideas– Security aspect

• Interactive

• Subsets need to satisfy multiple criteria

Page 42: Solutions for Cheminformatics

Workflow

Enumerate Virtual

LibraryExport to file

Import into Esma

Property calculation

Filtering / analysis

Export to file

Import into Jchem for

Excel

Filtering / analysis

Import into Spotfire

Page 43: Solutions for Cheminformatics

Workflow using the Library Profiler

Enumerate Virtual

Library

Select properties

to calculate

Filtering / analysis

Export to fileFurther

analysis

Page 44: Solutions for Cheminformatics

Apr 19, 202344

Page 45: Solutions for Cheminformatics

Charting – Scatter plot

Page 46: Solutions for Cheminformatics

Pivot View - Properties

Page 47: Solutions for Cheminformatics

Using ChemAxon tools

• High usage of Marvin View and Sketch– Easy to integrate

• JChem cartridge for filtering– Experience in using cartridge

• JChem tools for many of the property calculations– HBD, HBA, ROT, AMW, TPSA, Veber Bioavailability, BBB

distribution, undesirable functional groups, Andrews AVERAGE energy, Bioavailability score, Ligand binding efficiency, PGP Substrate prediction, pKa, protonated atom count, non-H atom count

Page 48: Solutions for Cheminformatics

WORKSHOP AT 14:00HANDS-ON SESSION

Focused and diverse library generation by ChemAxon

technology

Page 49: Solutions for Cheminformatics

Visit other technical presentations

www.chemaxon.com