“INTELLIGENT” CONSENSUS PREDICTIONS FOR DAPHNIA TOXICITY OF AGROCHEMICALS
Pathan Mohsin Khan (1), Kunal Roy (2,3), Emilio Benfenati (3)
(1) Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Chunilal Bhawan, 168, Maniktala Main Road, 700054 Kolkata, India
(2) Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032 Kolkata, India
(3) Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
Poster sections: Introduction; Overall Methodology; Key Points; Why Consensus?; Error estimation and predictivity comparison; References
[Flowchart, Overall Methodology: molecular descriptors (Dragon + PaDEL); validation parameters (MAE, Tropsha, rm2), reported as internal (R2, Q2, rm2 LOO, MAE95%) and external (Q2F1, Q2F2, rm2 Test, MAE95%)]
Introduction
Agrochemicals: a broad class of chemical products widely used in agriculture to prevent, destroy, or control harmful organisms (insects, fungi, microbes, and weeds) or diseases, or to protect crops before and after harvest, in order to minimize losses or enhance yield.
Over the last few years, the ecotoxicological hazard potential of agrochemicals has received much attention from industry and regulatory agencies.
Only limited experimental ecotoxicological data are available for such compounds.
Quantitative structure-toxicity relationship (QSTR) modeling is a ligand-based statistical approach that has proved useful for data gap filling.
Key Points
In the present work, we generated QSTR models for the Daphnia toxicity of different classes of agrochemicals (fungicides, herbicides, insecticides, and microbiocides) employing only simple and interpretable two-dimensional descriptors, and subsequently validated them strictly using test set compounds.
The validated individual and global models were then used for “intelligent” consensus model generation with the ICP tool (http://dtclab.webs.com/software-tools), with the objective of improving prediction quality and reducing prediction errors.
Both the individual and the consensus models were used to predict the toxicity of an external dataset of biocides in order to assess the predictive ability of the models.
According to the developed models, toxicity generally increases with lipophilicity, the number of halogen atoms (X) on an aromatic ring, the number of substituted benzene C(sp2) atoms, the number of chlorine atoms, the frequency of C-Cl at topological distance 5, the number of multiple bonds, the number of heavy atoms, the number of rotatable bonds, and carbon chain length.
Toxicity generally decreases with polarity, the presence of an ether moiety in an aliphatic chain, the presence of two oxygen atoms at topological distance 8, branching in the molecule, and the count of hydrogen bond acceptor atoms and/or polar surface area.
[Flowchart labels: descriptor classes (ring descriptors, E-state descriptors, molecular properties, connectivity indices, ETA descriptors, functional group counts, 2D atom pairs); validation; external set prediction (biocides dataset); comparison with ECOSAR]
[Figure: Summary of features responsible for the toxicity of agrochemicals]
[Figure: Comparison between our models and ECOSAR predictions]
Why Consensus?
A single model cannot guarantee the best-quality predictions for all compounds.
No single model covers the entire chemical space, whereas a consensus combines features from different models and so covers a wider range.
A consensus helps to reduce prediction errors.
Four types of consensus are proposed:
I. CM0: Simple average of predictions
II. CM1: Average of predictions from the 'qualified' individual models
III. CM2: Weighted average of predictions from the 'qualified' individual models
IV. CM3: Best selection of predictions (compound-wise) from the 'qualified' individual models
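The four consensus schemes listed above can be illustrated with a minimal Python sketch for a single compound. This is not the ICP tool's implementation: the poster does not state how a model becomes 'qualified', which weights CM2 uses, or which per-compound criterion CM3 uses, so the `qualified` flags, `weights`, and `expected_errors` below are hypothetical inputs chosen only for illustration.

```python
from statistics import mean


def cm0(preds):
    """CM0: simple average of all individual model predictions."""
    return mean(preds)


def cm1(preds, qualified):
    """CM1: average of predictions from 'qualified' models only."""
    chosen = [p for p, q in zip(preds, qualified) if q]
    return mean(chosen) if chosen else None


def cm2(preds, qualified, weights):
    """CM2: weighted average of predictions from 'qualified' models.

    The weights are hypothetical; the poster does not say how they are derived.
    """
    chosen = [(p, w) for p, q, w in zip(preds, qualified, weights) if q]
    total = sum(w for _, w in chosen)
    if not chosen or total == 0:
        return None
    return sum(p * w for p, w in chosen) / total


def cm3(preds, qualified, expected_errors):
    """CM3: prediction of the single best 'qualified' model for this compound.

    'Best' is taken here as the lowest hypothetical expected error.
    """
    chosen = [(e, p) for p, q, e in zip(preds, qualified, expected_errors) if q]
    return min(chosen)[1] if chosen else None


# Toy pEC50 predictions from three individual models for one compound.
preds = [5.0, 6.0, 7.0]
qualified = [True, True, False]    # hypothetical qualification flags
weights = [3.0, 1.0, 2.0]          # hypothetical model weights
expected_errors = [0.4, 0.2, 0.1]  # hypothetical per-compound error estimates

print(cm0(preds))                              # 6.0
print(cm1(preds, qualified))                   # 5.5
print(cm2(preds, qualified, weights))          # 5.25
print(cm3(preds, qualified, expected_errors))  # 6.0
```

Each function returns None when no model is qualified for the compound, which leaves the caller free to decide on a fallback.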
Error estimation and predictivity comparison
Model predictions cannot be considered reliable unless they are compared with standards and tested on external dataset compounds.
We employed an external dataset of 67 biocides. The prediction quality (R2pred) of the three individual models was 0.47, 0.50, and 0.47, with mean absolute errors of 1.407, 1.395, and 1.422, respectively. For consensus model 3 (CM3), R2pred was 0.49, while the mean absolute error fell to 1.37.
The prediction error (RMSEp) was compared with that of ECOSAR.
ECOSAR is widely preferred for ecotoxicological prediction of organic chemicals.
The comparison was made only with the test sets of the models.
Our models offered better predictive efficiency and a larger chemical domain.
Consensus models offered better predictivity than simple QSTR models.
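The error measures named in this section (R2pred, mean absolute error, RMSEp) follow standard definitions, sketched below in Python. R2pred is assumed here to be the external metric Q2F1, which references the training-set mean; Q2F2, which references the test-set mean, is included for comparison. The observed and predicted values in the example are made-up toy numbers, not data from the poster.

```python
from math import sqrt


def mae(obs, pred):
    """Mean absolute error of the predictions."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmsep(obs, pred):
    """Root mean squared error of prediction."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def q2_f1(obs, pred, train_mean):
    """Q2F1 (often reported as R2pred): residuals vs. spread about the TRAINING mean."""
    press = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - train_mean) ** 2 for o in obs)
    return 1 - press / spread


def q2_f2(obs, pred):
    """Q2F2: residuals vs. spread about the TEST-set mean."""
    test_mean = sum(obs) / len(obs)
    press = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - test_mean) ** 2 for o in obs)
    return 1 - press / spread


# Toy external-set values (pEC50 scale), for illustration only.
observed = [2.0, 4.0, 6.0]
predicted = [2.5, 3.5, 6.5]
training_mean = 5.0  # hypothetical mean response of the training set

print(mae(observed, predicted))                   # 0.5
print(rmsep(observed, predicted))                 # 0.5
print(q2_f1(observed, predicted, training_mean))  # about 0.932
print(q2_f2(observed, predicted))                 # 0.90625
```

Because MAE is expressed in the same units as the response, it can fall even when an R2-type metric barely changes, which is the pattern reported above for the consensus model.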
[Figure: individual models (fungicides, microbiocides, herbicides, and insecticides models) and global models. Training/test set sizes shown in the figure: Ntrain = 81 and Ntest = 26; Ntrain = 36 and Ntest = 12; Ntrain = 112 and Ntest = 35; Ntrain = 111 and Ntest = 36]
References
Waxman MF, The Agrochemical and Pesticides Safety Handbook, CRC Press, 1998.
Roy K, Kar S, Das RN, Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment, Academic Press, NY, 2015.
Roy K, Ambure P, Kar S, Ojha PK, Is it possible to improve the quality of predictions from an “intelligent” use of multiple QSAR/QSPR/QSTR models? J Chemom 32, 2018, e2992.
US EPA, The ECOSAR (ECOlogical Structure Activity Relationship) Class Program, 2012.
Acknowledgement
PMK thanks the Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, for a fellowship.
KR thanks the European Commission for financial assistance under the project VERMEER [LIFE16 ENV/IT/000167].
[Figure labels: GADCV; PLS; Ntrain = 313 and Ntest = 105]
Daphnia toxicity data (pEC50 values), example compounds:
Flutianil: pEC50 = 6.98
Propiconazole: pEC50 = 4.02
Pentachlorophenol: pEC50 = 5.95
Acetic acid: pEC50 = 2.08
Cyfluthrin: pEC50 = 9.23
Propylene glycol: pEC50 = 1.83
Tributyltin methacrylate: pEC50 = 2.95
Tributyltin oxide: pEC50 = 7.90
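For readers unfamiliar with the scale used above, pEC50 is conventionally the negative base-10 logarithm of the EC50, so a higher pEC50 means a more toxic compound. The short Python sketch below assumes the EC50 is expressed in mol/L; the poster does not state the unit, so that is an assumption, and the function names are illustrative only.

```python
from math import log10


def pec50_from_ec50(ec50_mol_per_l):
    """Convert an EC50 (assumed to be in mol/L) to the pEC50 scale."""
    if ec50_mol_per_l <= 0:
        raise ValueError("EC50 must be positive")
    return -log10(ec50_mol_per_l)


def ec50_from_pec50(pec50):
    """Convert a pEC50 back to an EC50 in mol/L (same unit assumption)."""
    return 10 ** (-pec50)


# A one-unit increase in pEC50 corresponds to a tenfold lower EC50.
print(pec50_from_ec50(1e-6))  # EC50 of 1 micromolar
print(ec50_from_pec50(9.23))  # back-conversion of the cyfluthrin value listed above
```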