Prediction of pH-Dependent Solubility
of Drugs and Drug Candidates
Niclas Tue Hansen
Centre for Biological Sequence Analysis
Technical University of Denmark
Workshop Chemoinformatics in Europe, Obernai, France, 29 May – 1 June 2006
Why consider pH-dependency?
[Figure: chemical structure of verapamil, annotated "1000x difference"]
In Theory: Henderson-Hasselbalch
Acids: $\log S = \log S_0 + \log\left(1 + 10^{\mathrm{pH} - \mathrm{p}K_a}\right)$
[Figure: curve plotted against pH (0 to 14); y-axis from -6 to 4]
In Theory: Henderson-Hasselbalch
Bases: $\log S = \log S_0 + \log\left(1 + 10^{\mathrm{p}K_b - \mathrm{pH}}\right)$
[Figure: curve plotted against pH (0 to 14); y-axis from -6 to 4]
In Theory: Henderson-Hasselbalch
Ampholytes: $\log S = \log S_0 + \log\left(1 + 10^{\mathrm{pH} - \mathrm{p}K_a} + 10^{\mathrm{p}K_b - \mathrm{pH}}\right)$
[Figures: three plots against pH (0 to 14); y-axis from -6 to 4]
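The three Henderson-Hasselbalch forms above can be sketched in Python as follows. This is a minimal illustration, not code from the talk: the function names are hypothetical, and pKb is assumed to be used the way the base formula uses it, as the pKa of the protonated base.

```python
import math


def log_s_acid(ph, log_s0, pka):
    """Acid: log S = log S0 + log10(1 + 10**(pH - pKa))."""
    return log_s0 + math.log10(1.0 + 10.0 ** (ph - pka))


def log_s_base(ph, log_s0, pkb):
    """Base: log S = log S0 + log10(1 + 10**(pKb - pH)).

    Assumption: pKb is the pKa of the protonated base.
    """
    return log_s0 + math.log10(1.0 + 10.0 ** (pkb - ph))


def log_s_ampholyte(ph, log_s0, pka, pkb):
    """Ampholyte: both ionisation terms appear inside the logarithm."""
    return log_s0 + math.log10(1.0 + 10.0 ** (ph - pka) + 10.0 ** (pkb - ph))


# Illustrative parameters only (not taken from the slides).
for ph in range(0, 15, 2):
    print(ph, round(log_s_acid(ph, -3.0, 4.0), 2), round(log_s_base(ph, -3.0, 9.0), 2))
```

At a pH equal to the pK value, each single-site form gives log S0 + log 2. Three pH units into the ionised region the logarithmic term is close to 3, which corresponds to roughly a 1000-fold change in solubility.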
Modular pH-dependent model
pH-dependent model:
– Input: Molecule
– Module 1: Predict solubility
– Module 2: Predict pKa/pKb
– Combine: Henderson-Hasselbalch Equation
– Output: Solubility pH-curve
Intrinsic Solubility Module
– Database (solubility + structures): Open literature (378 compounds), PHYSPROP (4548 compounds)
– Calculate descriptors: MOE 2D descriptors (171)
– Descriptor selection / Train models: Heuristic Algorithm / Genetic Algorithm, Artificial Neural Networks (HOWLIN)
– Validate models: Common test set (21 compounds), Lundbeck A/S (25 compounds)
– Select one prediction model: Select best model
Descriptor selection
[Figure: Pearson Correlation Coefficient (PCC, left axis, 0 to 1) and number of descriptors (right axis, 0 to 200) versus runs (0 to 5000); labels: Heuristic Algorithm, Genetic Algorithm]
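As an illustration of descriptor selection with a genetic algorithm, the sketch below evolves bit masks over descriptor columns and tracks the Pearson correlation coefficient of the best mask per generation. Everything here is an assumption made for illustration: the talk does not give its GA settings, the fitness function is a toy stand-in (the talk trained neural networks), and the data are synthetic.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def fitness(mask, X, y):
    """Toy fitness (assumption): PCC between y and the sum of selected columns."""
    if not any(mask):
        return 0.0
    pred = [sum(v for v, keep in zip(row, mask) if keep) for row in X]
    return pearson(pred, y)


def tournament(pop, X, y, rng):
    """Pick the fitter of two randomly drawn individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a, X, y) >= fitness(b, X, y) else b


def ga_select(X, y, generations=30, pop_size=20, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n_desc = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n_desc)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, X, y))
    history = [fitness(best, X, y)]
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(pop, X, y, rng), tournament(pop, X, y, rng)
            cut = rng.randrange(1, n_desc)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=lambda m: fitness(m, X, y))
        history.append(fitness(best, X, y))
    return best, history


# Synthetic data: only descriptors 0 and 2 carry signal.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(40)]
y = [row[0] + row[2] + 0.1 * data_rng.gauss(0, 1) for row in X]
best, history = ga_select(X, y)
print(best, round(history[-1], 3))
```

Because the best mask is copied unchanged into each new generation, the recorded PCC can never decrease from one generation to the next.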
Results
Model      Training (3-part validation)   Validation (Common Test Set)   No. Desc.
           R2      RMSE                   R2      RMSE
378/Heu    0.86    0.97                   0.75    1.16                   5
378/GA     0.90    0.87                   0.64    1.33                   >50
4548/GA    0.94    0.64                   0.82    1.08                   >50
4548/Heu   0.94    0.64                   0.85    0.97                   9
External validation: Others
                 Huusk.   Klopman   This work   Kuhne*
Train/Test set   1297     482       4548        694
# desc.          30       46        9           50
R2               0.91     0.70      0.85        0.75
RMSE             0.63     1.24      0.97        1.06
* Based on 19 of the 21 molecules
pKa Prediction Module
Two models were validated:
– ACD/Labs pKa module
– Marvin pKa plug-in
Validation set (PhysProp) with filter:
– Experimental pKa values
– Temperature range 25±5°C
– pKa range 0-13
Result: 467 experimental pKa values
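The filter above can be sketched as a small predicate over records. The record layout, field names and example entries are hypothetical, and treating a missing temperature as a failed filter is an assumption, not something the slide states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PkaRecord:
    name: str
    pka: Optional[float]            # experimental pKa, None if not available
    temperature_c: Optional[float]  # measurement temperature in degrees Celsius


def passes_filter(rec, t_ref=25.0, t_tol=5.0, pka_min=0.0, pka_max=13.0):
    """Keep records with an experimental pKa in 0-13 measured at 25±5 °C."""
    if rec.pka is None or rec.temperature_c is None:
        return False  # assumption: incomplete records are dropped
    in_temp = abs(rec.temperature_c - t_ref) <= t_tol
    in_range = pka_min <= rec.pka <= pka_max
    return in_temp and in_range


# Made-up placeholder records, not PhysProp data.
records = [
    PkaRecord("cmpd_A", 4.4, 25.0),
    PkaRecord("cmpd_B", 9.1, 37.0),   # outside temperature window
    PkaRecord("cmpd_C", 13.8, 25.0),  # outside pKa range
    PkaRecord("cmpd_D", None, 25.0),  # no experimental pKa
    PkaRecord("cmpd_E", 2.9, 20.0),   # on the edge of the window, kept
]
kept = [r for r in records if passes_filter(r)]
print([r.name for r in kept])
```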
Validation: ACD/Labs
ACD/Labs model:
– Expensive
– Limited rights
Statistics:
– N = 454 (13 err.)
– MAE = 0.34
– RMSE = 0.70
[Figure: predicted pKa versus experimental pKa; both axes 0 to 14]
Validation: Marvin
Marvin model:
– Free for academia
– Accessible Java-classes
Statistics:
– N = 458 (9 err.)
– MAE = 0.48
– RMSE = 0.71
[Figure: predicted pKa versus experimental pKa; both axes 0 to 14]
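The statistics reported for both pKa tools (N, number of failed predictions, MAE, RMSE) can be computed along the following lines. This is a sketch under assumptions: the function name is hypothetical, a failed prediction is represented as None, and the values are toy numbers rather than the validation set. In both validations N plus the error count equals 467, which suggests failed compounds were excluded from the statistics.

```python
import math


def validate(experimental, predicted):
    """Return (n, n_err, mae, rmse); None in `predicted` marks a failed prediction."""
    pairs = [(e, p) for e, p in zip(experimental, predicted) if p is not None]
    n_err = len(experimental) - len(pairs)
    residuals = [p - e for e, p in pairs]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return len(pairs), n_err, mae, rmse


# Toy values for illustration only.
exp_pka = [4.0, 9.0, 2.0, 7.0, 10.0]
pred_pka = [4.5, 8.5, 2.0, 8.0, None]
n, n_err, mae, rmse = validate(exp_pka, pred_pka)
print(n, n_err, round(mae, 2), round(rmse, 2))
```

RMSE weights large deviations more heavily than MAE, so a gap between the two points to a few badly predicted compounds.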
Combined model
– Input: Structure
– Molecule conversion (Marvin)
– Solubility branch: Calculate 2D Descriptors, Normalization, Neural Network (Heu), Transform to solubility
– pKa branch: Predict pKa/pKb (Marvin)
– Combine using HH
– Output: Solubility pH-curve
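The way the two modules feed the Henderson-Hasselbalch step can be sketched as a function that takes the two predictors as plug-in callables. The names and the stand-in predictors are hypothetical; the real modules (a neural network and Marvin) are not reproduced here.

```python
import math


def hh_log_s(ph, log_s0, pka=None, pkb=None):
    """Henderson-Hasselbalch log S; a missing pKa or pKb drops that term."""
    ratio = 1.0
    if pka is not None:
        ratio += 10.0 ** (ph - pka)
    if pkb is not None:
        ratio += 10.0 ** (pkb - ph)
    return log_s0 + math.log10(ratio)


def solubility_ph_curve(structure, predict_log_s0, predict_pk, ph_values):
    """Combine an intrinsic-solubility predictor and a pKa/pKb predictor."""
    log_s0 = predict_log_s0(structure)
    pka, pkb = predict_pk(structure)
    return [(ph, hh_log_s(ph, log_s0, pka, pkb)) for ph in ph_values]


# Stand-in predictors returning fixed values (assumptions for illustration).
def fake_log_s0(structure):
    return -3.0


def fake_pk(structure):
    return 4.0, None  # an acid with pKa 4 and no basic group


curve = solubility_ph_curve(
    "placeholder-structure", fake_log_s0, fake_pk, [float(ph) for ph in range(8)]
)
for ph, log_s in curve:
    print(ph, round(log_s, 3))
```

Keeping the predictors as arguments mirrors the modular design: either module can be swapped without touching the combination step.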
Experimental Solubility Curves
Source: literature (~70 drugs)
Filter:
– Exact solubility values
– Experimental pKa and S0
– No co-solvents
Result:
– 27 experimental solubility curves
Validation of Theory and Prediction
Henderson-Hasselbalch relationship:
$\log S = \log S_0 + \log\left(1 + 10^{\mathrm{pH} - \mathrm{p}K_a}\right)$
– Validation of Theoretical model: with experimental parameters
– Validation of Prediction model: with predicted parameters
Validation with Ibuprofen
Experimental:
– pKa = 4.42
– Log S0 = -3.62
Predicted:
– pKa = 4.85
– Log S0 = -3.11
Statistics:
– RMSEtheo = 0.03
– RMSEpred = 0.37
[Figure: Log S (-4 to 0) versus pH (0 to 8); series: Exp, Theo, Pred]
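A short worked example with the ibuprofen parameters listed above shows how the theoretical and predicted curves differ. It uses the acid form of the Henderson-Hasselbalch relationship; the pH grid and variable names are assumptions. The measured solubility points are not in the slides, so the RMSE values are not recomputed here.

```python
import math


def log_s_acid(ph, log_s0, pka):
    """Acid form: log S = log S0 + log10(1 + 10**(pH - pKa))."""
    return log_s0 + math.log10(1.0 + 10.0 ** (ph - pka))


# Parameters as given on the slide.
EXPERIMENTAL = {"pka": 4.42, "log_s0": -3.62}
PREDICTED = {"pka": 4.85, "log_s0": -3.11}

ph_grid = [0.5 * i for i in range(17)]  # pH 0.0 to 8.0 (assumed grid)
theo = [log_s_acid(ph, EXPERIMENTAL["log_s0"], EXPERIMENTAL["pka"]) for ph in ph_grid]
pred = [log_s_acid(ph, PREDICTED["log_s0"], PREDICTED["pka"]) for ph in ph_grid]
gap = [p - t for p, t in zip(pred, theo)]  # predicted minus theoretical log S

for ph, t, p in zip(ph_grid, theo, pred):
    print(f"{ph:4.1f} {t:7.2f} {p:7.2f}")
```

At low pH the gap between the curves is the difference in log S0 (0.51 log units). At high pH the higher predicted pKa (by 0.43) offsets most of it, leaving a gap of about 0.08.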
Validation of combined model
Theory vs. Exp.:
– RMSE = 0.25 (27 drugs)
– RMSE = 0.09 (25 drugs)
Prediction vs. Exp.:
– RMSE = 0.88 (27 drugs)
– RMSE = 0.80 (26 drugs)
[Figure: Theoretical model; number of molecules (0 to 20) per RMSE (LogS) bin, bins 0.1 to 1.0 and >1.0]
[Figure: Prediction model; number of molecules (0 to 14) per RMSE (LogS) bin, bins 0.5 to 5.0 and >5.0]
Summary
This model seems adequate for prediction of pH-dependent solubility
Good pKa/pKb predictions enable us to predict pH-dependent solubilities without increasing the error significantly
Acknowledgements
Supervisors:
– Irene Kouskoumvekaki
– Svava Ósk Jónsdóttir
– Flemming Steen Jørgensen (DFU)
– Søren Brunak
Others:
– Thomas Pontén
– Olivier Taboureau
– Jens Pontoppidan
Thanks… Questions?!