genomics research institute university of cincinnati compound library wm. l. seibel january 10, 2007
TRANSCRIPT
Genomics Research Institute
University of Cincinnati
Compound Library
Wm. L. Seibel
January 10, 2007
Overview
Library Overview
Compound Characteristics
– Design Concepts
– Drug-Like
Library Screening Options
Summary of Library Advantages
Compound Repository
Haystack Neat Compound Storage
– Capacity = 200,000 bottles
– Current = 207,000 bottles
– Freezer storage when appropriate
Solar (Solution Archive)
– DMSO solutions
– Capacity = 1.8 million tubes, 10,000 deep well (96) plates, 13,600 shallow well (384) plates
– Current = 325,000 unique compounds
Related Compound Handling and Dissolution instruments.
Housed at P&G’s Mason Business Center in ca. 3,000 sq ft of lab space
Haystack® Neat Chemical Storage
Solar® Solution Storage
Library Design Principles
Drug Lead Discovery (greatly simplified)
[Figure: Target Identification → Target Validation → Compound Screening → LEAD → Compound Optimization; additional label: H2L Activity Confirmation]
Compound selection can greatly enhance efficiency.
Library Compounds
The UC/GRI Compound Library comprises compounds from four general categories:
– 1. Compounds purchased from numerous sources, selected to provide a diverse representation across “drug-like” structural properties.
– 2. Compounds purchased that specifically target kinases and GPCRs.
– 3. Compounds donated by P&G Pharmaceuticals that were prepared in-house specifically for projects on kinases, GPCRs, phosphatases, ion channels and proteases.
– 4. Combinatorial Chemistry contract syntheses (Lower Priority Cmpds).
This screening library is broadly diverse across drug-like space, with enhanced concentrations in areas of key biological relevance, notably kinases and GPCRs.
P&G Pharma Selected Compounds
Chemically Diverse
– Represented uniformly across drug-like space.
– Want to ensure uniform, comprehensive and diverse representation of compounds across the structural & property types that are typical of drugs and lead structures.
Compounds selected based on drug-like properties (within “Drug-Like Space”)
– Chemical and Property Filters
– Lipinski, Veber etc. rules
Total P&G investment to assemble repository = $22 M (over past 10 years)
Chemical Property Filters
Vendor Database
→ Remove duplicates
→ Remove reactives, unusual groups & toxicophores (80 substructures)
→ MW filter, Solubility filter
→ Lipinski Rule of Five
   • ≤ 5 H-bond donors
   • MW < 500
   • cLogP < 5
   • N's + O's < 10
→ “Cleaned” database
26 databases; >4 million structures
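The Rule of Five step of the filter cascade can be sketched in a few lines. This is only an illustration, not the pipeline that was actually run: it assumes the four properties have already been computed for each structure, reads the donor criterion as "at most five donors" (the usual statement of Lipinski's rule), and uses hypothetical names throughout (`Props`, `passes_rule_of_five`, `clean`, `vendor_db`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Props:
    """Precomputed properties for one structure (hypothetical record layout)."""
    mol_weight: float
    clogp: float
    h_donors: int
    n_plus_o: int  # count of N and O atoms, used as the acceptor proxy


def passes_rule_of_five(p: Props) -> bool:
    """True when the structure satisfies all four thresholds listed on the slide."""
    return (
        p.h_donors <= 5
        and p.mol_weight < 500
        and p.clogp < 5
        and p.n_plus_o < 10
    )


def clean(db: dict[str, Props]) -> dict[str, Props]:
    """Keep only the entries that pass the filter."""
    return {cid: p for cid, p in db.items() if passes_rule_of_five(p)}


# Two made-up records, for illustration only.
vendor_db = {
    "cmpd-1": Props(mol_weight=320.4, clogp=2.1, h_donors=2, n_plus_o=5),
    "cmpd-2": Props(mol_weight=612.7, clogp=6.3, h_donors=7, n_plus_o=12),
}
cleaned = clean(vendor_db)
```

In practice the duplicate, substructure, MW and solubility filters would run before this step, so that the more expensive property calculations are only done on structures that survive.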
Diversity Analysis
Describing Molecular Structure
Convert molecular structure into numerical values by making computations of specific structural features
[Figure: Structure → Computations → Numeric Descriptors Relevant to Binding Functions]
Diversity Assessment Methodology
Used BCUT descriptors *
– R.S. Pearlman, UT at Austin
– DiverseSolutions (now available from Tripos)
Computed ~120 BCUTs
Selected a best subset of 6 BCUTs
– 6D space – visualization is a challenge
* J. Chem. Inf. Comput. Sci. 1999, 39, 11-20.
Pearlman’s BCUT descriptors *
6D Chem-space (structure-space)
– 2 atomic partial-charge descriptors
– 2 atomic polarizability descriptors
– a hydrogen-bond acceptor descriptor
– a hydrogen-bond donor descriptor
* http://www.awod.com/netsci/Issues/Jun96/feature1.html
Concept of Chemistry Space
[Figure: chemistry-space plot; axes Desc-1 and Desc-2]
Defining Drug Space
Based on structures of “drug-like” compounds from
– The World Drug Index (WDI)
– The National Cancer Institute Open Database
[Figures: two chemistry-space plots; axes Desc-1 and Desc-2]
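One way to make "drug space" concrete is to divide each descriptor axis into bins and record which cells are occupied by reference drug-like compounds; a candidate then counts as inside drug space if it falls in an occupied cell. The sketch below assumes this cell-based reading, which the slides do not spell out, and uses two descriptors rather than six so the example is easy to follow. All names, bin counts and coordinates are hypothetical.

```python
from typing import Iterable

Point = tuple[float, ...]
Cell = tuple[int, ...]


def cell_of(point: Point, lo: float, hi: float, bins: int) -> Cell:
    """Map a descriptor vector to the index of the cell that contains it."""
    width = (hi - lo) / bins
    # Clamp so that values on or beyond the upper edge land in the last bin.
    return tuple(min(bins - 1, max(0, int((x - lo) / width))) for x in point)


def occupied_cells(reference: Iterable[Point], lo: float, hi: float, bins: int) -> set[Cell]:
    """Cells that contain at least one reference (drug-like) compound."""
    return {cell_of(p, lo, hi, bins) for p in reference}


def in_drug_space(point: Point, drug_cells: set[Cell], lo: float, hi: float, bins: int) -> bool:
    """True when the candidate falls in a cell occupied by a reference compound."""
    return cell_of(point, lo, hi, bins) in drug_cells


# Made-up 2D descriptor values standing in for WDI / NCI reference structures.
reference_drugs = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.70)]
drug_cells = occupied_cells(reference_drugs, lo=0.0, hi=1.0, bins=4)

inside = in_drug_space((0.05, 0.10), drug_cells, 0.0, 1.0, 4)
outside = in_drug_space((0.50, 0.95), drug_cells, 0.0, 1.0, 4)
```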
Diverse Subset Selection
Avoiding “redundant” representations
[Figure: Diversity Analysis]
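Avoiding redundant representations amounts to picking compounds that are far apart in descriptor space. The sketch below uses greedy MaxMin selection, a common diversity-picking heuristic that is swapped in here for illustration; the slides do not say which algorithm was applied. The function name, seed choice and 2D coordinates are hypothetical.

```python
import math


def maxmin_select(points: dict[str, tuple[float, ...]], k: int, seed_id: str) -> list[str]:
    """Greedy MaxMin: repeatedly add the compound farthest from those already picked."""
    if seed_id not in points:
        raise KeyError(seed_id)
    chosen = [seed_id]
    # Distance from every compound to its nearest already-chosen compound.
    nearest = {cid: math.dist(p, points[seed_id]) for cid, p in points.items()}
    while len(chosen) < min(k, len(points)):
        candidates = [cid for cid in points if cid not in chosen]
        next_id = max(candidates, key=nearest.__getitem__)
        chosen.append(next_id)
        for cid, p in points.items():
            nearest[cid] = min(nearest[cid], math.dist(p, points[next_id]))
    return chosen


# "a2" and "c2" sit almost on top of "a" and "c", so they are redundant.
library = {
    "a": (0.0, 0.0),
    "a2": (0.05, 0.0),
    "b": (1.0, 0.0),
    "c": (0.0, 1.0),
    "c2": (0.0, 0.95),
}
subset = maxmin_select(library, k=3, seed_id="a")
```

With these toy coordinates the near-duplicates are skipped and one compound from each region is kept.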
Compound Supply
External Suppliers (20+ vendors)
– Brokerage Houses
   Individual Compounds (Diversity)
   Target Directed Libraries
– Combinatorial Chemistry Companies
Corporate Suppliers
– P&G Pharmaceuticals
   Focus Areas - Medicinal Chemistry: Kinase, GPCR, Phosphatase, Ion Channel, Proteases
   Lead ID – Combinatorial Chemistry
Vendor “Dependability”
Columns: Evaluated = # cmpds evaluated by MS; Present = # compounds that were present in MS; Absent = # compounds that were not present in MS; Rate = analytical confirmation rate; Conc = ave conc (mM); Purity = ave purity (%); HTS confirmed = # times a compound present in MS confirmed in an HTS assay.
Vendor    Evaluated  Present  Absent  Rate    Conc (mM)  Purity (%)  HTS confirmed
A         160        143      17      89.4%   1.9        95.9        0
B         171        142      29      83.0%   1.9        94.5        7
C         182        145      37      79.7%   6.4        94.0        34
D         31         23       8       74.2%   9.5        98.3        3
E         649        470      179     72.4%   5.8        93.1        117
F         1137       788      349     69.3%   5.1        91.1        426
G         47         32       15      68.1%   6.2        79.0        15
H         20         13       7       65.0%   4.4        93.9        5
I         230        0        0       63.3%   4.6        91.0        0
J         16         10       6       62.5%   6.1        91.7        7
K         258        152      106     58.9%   4.9        76.4        227
L         1906       962      944     50.5%   8.3        77.4        1803
M         651        266      385     40.9%   2.5        61.2        115
N         54         18       36      33.3%   3.9        87.0        14
O         48         15       33      31.3%   4.4        90.2        19
Totals    5560       3179     2151                                   2792
Averages                              62.8%   5.1        87.7
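The analytical confirmation rate in the table appears to be the number of compounds present in MS divided by the number evaluated. The short sketch below checks that reading against three rows and the totals. It assumes the vendor letters A to O label the data rows in order, and the function and variable names are hypothetical.

```python
def confirmation_rate(present: int, evaluated: int) -> float:
    """Percentage of evaluated compounds whose identity was confirmed by MS."""
    if evaluated <= 0:
        raise ValueError("evaluated must be positive")
    return 100.0 * present / evaluated


# (evaluated, present) pairs copied from the table for three vendors.
rows = {"A": (160, 143), "B": (171, 142), "L": (1906, 962)}
rates = {vendor: confirmation_rate(p, e) for vendor, (e, p) in rows.items()}

# Pooled rate over every compound evaluated, from the Totals row.
pooled = confirmation_rate(3179, 5560)

for vendor, rate in rates.items():
    print(f"{vendor}: {rate:.1f}%")
print(f"pooled: {pooled:.1f}%")
```

The per-vendor values agree with the tabulated rates to within rounding. The pooled figure comes out near 57%, lower than the 62.8% in the Averages row, which matches the unweighted mean of the fifteen per-vendor rates; large low-scoring vendors such as L pull the pooled number down.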
On the Other Hand…
Even within Drug-like space, certain classes can be somewhat clustered.
This library therefore has added “focused libraries” from internal synthesis and external vendors emphasizing compounds relevant to:
– GPCRs
– Kinases
P&G Pharma Selected Compounds
Defined by experienced medicinal chemists
– Broad, uniform distribution across Drug Space with concentrations of density in key areas from directed purchase and in-house synthesis.
Compare to 5 vendor screening collections
– 3,000 to 500,000 compounds
– 27% - 56% of vendors’ compound collections do NOT meet criteria for drug-like
– UC Compound collection is 2X to 100X more chemically diverse across Drug Space.
Vendor Libraries are inherently predisposed to clustered groupings.
We can pick the best, most relevant compounds from each.
Screening Library Options
Screening Library Design Options
Diverse broad collections
– Comprehensive screening against all available compounds (ca. 250,000 cmpds)
– Screening against a representative subset of available compounds (e.g. 5000 cmpds)
Class-associated compounds
– Compounds with structural features often associated with a particular target (e.g. kinases).
Structure-based compound selection
– Virtual Screening of a crystal structure or high quality homology model to identify the most likely inhibitors (ca. 2000 cmpds), followed by assay of these compounds.
– Virtual Screening as above based on pharmacophore models from known ligands of the target.
Diverse Subset Selection
Same Concept as Previously
[Figure: 250,000 Cmpd Library → Diversity Analysis → 5,000 Cmpd Abstract]
Diverse Subset Selection
Execute Assay on subset of compounds
[Figure: 5,000 Cmpd Abstract → MTS Assay → Identify Hits in Assay]
Diverse Subset Selection
Pull Similar Compounds from original 250K Set
[Figure: 250,000 Cmpd Library → Similarity Search → 300 Cmpd Similarity Library]
Diverse Subset Selection
Pull Similar Compounds from original 250K Set
[Figure: 300 Cmpd Similarity Library → MTS Assay → Identify Hits in Assay]
This cycle can be repeated several times until no new actives are found.
[Figure: Selection of Nearest Neighbors of Hits; legend: Biological hit, Near neighbor]
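Pulling near neighbors of a hit from the full library can be sketched as a similarity search. The slides do not say which similarity measure was used, so this example assumes Tanimoto similarity on fingerprints represented as sets of "on" bit positions; the function names, threshold and fingerprints are all hypothetical.

```python
def tanimoto(a: frozenset[int], b: frozenset[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbors(
    hits: set[str],
    library: dict[str, frozenset[int]],
    threshold: float = 0.7,
) -> list[str]:
    """IDs of library compounds at least `threshold` similar to any hit (hits excluded)."""
    hit_fps = [library[h] for h in hits]
    return sorted(
        cid
        for cid, fp in library.items()
        if cid not in hits and any(tanimoto(fp, h) >= threshold for h in hit_fps)
    )


# Toy fingerprints: "nn-1" shares three of five bits with the hit, "far-1" shares none.
library = {
    "hit-1": frozenset({1, 2, 3, 4}),
    "nn-1": frozenset({1, 2, 3, 5}),
    "far-1": frozenset({7, 8, 9}),
}
neighbors = nearest_neighbors({"hit-1"}, library, threshold=0.5)
```

Lowering the threshold widens the neighbor set, which is one way to arrive at a follow-up library of a chosen size.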
Iterative Cycling
[Figure: iterative screening cycle. Labels: 5000 Cmpd Representative Library; Assay; ~20 Cmpd Hit List; NN Search of UC/GRI Library; ~1000 Cmpd NN Library; Assay; ~50 Cmpd Hit List; NN Search of Commercial Compounds; ~1000 Cmpd NN Library; 2-3 Iterations (on each loop); Final Set; Final Hit List]
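The assay / nearest-neighbor cycle can be written as a small loop that stops when a round produces no new actives or when a round limit is reached. This is a schematic sketch only: `assay` and `expand` are hypothetical stand-ins for the wet-lab assay and the nearest-neighbor search, and the compound IDs are invented.

```python
from typing import Callable, Iterable


def iterative_screen(
    start_set: Iterable[str],
    assay: Callable[[str], bool],
    expand: Callable[[set[str]], set[str]],
    max_rounds: int = 3,
) -> set[str]:
    """Assay a set, expand the hits to their neighbors, and repeat."""
    tested: set[str] = set()
    actives: set[str] = set()
    to_test = set(start_set)
    for _ in range(max_rounds):
        to_test -= tested              # never assay the same compound twice
        if not to_test:
            break
        new_hits = {c for c in to_test if assay(c)}
        tested |= to_test
        if not new_hits:               # no new actives: the cycle has converged
            break
        actives |= new_hits
        to_test = expand(new_hits)     # neighbors of this round's hits
    return actives


# Toy stand-ins: a fixed set of actives and a fixed neighbor map.
true_actives = {"c1", "c4", "c5"}
neighbor_map = {"c1": {"c3", "c4"}, "c4": {"c5"}, "c5": set()}


def toy_assay(cid: str) -> bool:
    return cid in true_actives


def toy_expand(hits: set[str]) -> set[str]:
    return set().union(*(neighbor_map.get(h, set()) for h in hits))


final_hits = iterative_screen({"c1", "c2"}, toy_assay, toy_expand)
```

Swapping `expand` for a search over a commercial collection instead of the in-house library gives the second loop shown on the slide.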
Diverse Subset Selection
Pull Similar Compounds from Commercial 4.8M Set
[Figure: 4.8 M Commercial Library → Similarity Search → 300 Cmpd Similarity Library]
Assay for actives, and cycle hits back through similarity search loop.
Class-Associated Compounds
Select compounds similar to compounds known to interact with target class
[Figure: 250,000 Cmpd Library → Similarity Analysis → 15,000 Cmpd Library; additional label: Target Active]
Virtual Screening
Screen GRI/UC library
Screen Commercial Cmpds
Iterative Cycling
[Figure: iterative screening cycle. Labels: 5000 Cmpd Representative Library; Assay; ~20 Cmpd Hit List; NN Search of UC/GRI Library; ~1000 Cmpd NN Library; Assay; ~50 Cmpd Hit List; NN Search of Commercial Compounds; ~1000 Cmpd NN Library; 2-3 Iterations (on each loop); Final Set; Final Hit List]
Hits of any origin can enter the cycle at this point.
Hit to Lead Follow-up (H2L)
How to determine optimal hits for follow-up
– Confirm ID and activity of hits
– Cluster into groups of related compounds
– Develop preliminary SAR info on each cluster
   ID key features for binding & selectivity
– Assess each cluster for optimization
   “Which compounds have fewest problems?”
   Synthetic Ease; Proprietary Assessment; Selectivity Issues; Physical Properties; Metabolic Handles; Cellular Activity
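Grouping confirmed hits into clusters of related compounds can be illustrated with a simple leader-style clustering: each hit joins the first cluster whose leader it resembles closely enough, otherwise it starts a new cluster. The clustering method used in practice is not stated in the slides, so this is an assumed stand-in; fingerprints are sets of bit positions, and the names and threshold are hypothetical.

```python
def tanimoto(a: frozenset[int], b: frozenset[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_cluster(
    fps: dict[str, frozenset[int]], threshold: float = 0.5
) -> list[list[str]]:
    """Assign each hit to the first cluster whose leader is similar enough."""
    clusters: list[list[str]] = []  # first ID in each list is the cluster leader
    for cid, fp in fps.items():
        for cluster in clusters:
            if tanimoto(fp, fps[cluster[0]]) >= threshold:
                cluster.append(cid)
                break
        else:
            clusters.append([cid])  # no similar leader found: start a new cluster
    return clusters


# Toy hit list containing two chemotypes.
hits = {
    "h1": frozenset({1, 2, 3, 4}),
    "h2": frozenset({1, 2, 3, 5}),
    "h3": frozenset({10, 11, 12}),
    "h4": frozenset({10, 11, 13}),
}
clusters = leader_cluster(hits, threshold=0.5)
```

Each resulting cluster can then be assessed as a unit for preliminary SAR and for the optimization criteria listed above. The result depends on input order, which is a known limitation of leader-style methods.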
Summary
UC/P&GP Library Advantages
Quality Advantages
– Library carefully constructed to span drug-like space.
– Compounds restricted to those with properties consistent with clinical materials.
– Proven to produce viable hits for follow-up programs.
– Comparisons have uniformly been favorable relative to commercial vendor sets.
– Includes targeted subsets of compounds for key areas: GPCRs, Kinases, Phosphatases, Ion channels.
Practical Advantages
– SD file of structures and ID tags furnished for unrestricted use.
– Many compounds from commercial sources, so resupply likely to be easy.
– Materials supplied in microtiter plates (96 or 384) as requested.
– Solution Stores made from local dry stores, so follow-up assays will be rapid.
Technical Advantages
– Act as Liaison with screening group (internal or external).
– Participate in advisory committee for compound acquisition decisions.
Library Use and Data Interpretation
Library Design Assistance
– Computational assistance in selecting diverse subsets or directed subsets.
– Computational assistance in selecting compounds similar to known leads (Nearest Neighbor).
– Computational assistance in virtual screening by pharmacophore or protein docking.
Library Use and Data Interpretation
Follow-up Assistance
– Resupply assistance, synthesis info, supplier info
– Assistance in obtaining related available compounds (Similarity, substructure, Unity, Pharmacophore).
– Provide preliminary lit search info (known info, IP, etc) on prominent hits.
– Clustering of hits into chemical/pharmacophore classes included.
– Provide help identifying chemistry groups with related interests for collaborations
– Provide assistance in connecting with contract chemistry services (consult).
Questions
Thank you
Acknowledgements
Operations
– Stacey Frazier
– Kathy Gibboney
Computational
– Matt Wortman
– David Stanton
– Prakash Madhav
Management
– Ruben Papoian
– Sandra Nelson
– Joseph Gardner
– Kenny Morand