Derek Nexus and Sarah Nexus: working together for ICH M7
European ICGM, September 2014
Dr Nicholas Marchetti Product Manager
OUTLINE
• Impact of changes driven by M7
• In silico solutions
  • Vitic Nexus – an authoritative toxicity database
  • Derek Nexus – the leading expert system
  • Sarah Nexus – an advanced statistical system
• Expert assessment from 2 predictions
What does M7 cover?
• Harmonises guidelines – FDA, EMA, Japan
• Recognises the primacy of the Ames assay
• Covers identification, qualification and categorisation
• Control of mutagenic impurities to limit potential carcinogenic risk
[Flowchart: M7 assessment workflow]
• Evaluate drug substance, impurities, degradants, (metabolites), intermediates…
• Inputs: databases, in-house data, literature; 2 x in silico QSAR
• Outcomes: known mutagen, predicted positive, predicted negative, known non-mutagen
• Actions: Ames test; limit according to TTC or present purge argument for absence; treat as non-mutagenic
Focussing on the identification step
[Figure labels: Leadscope, Multicase, Expert Review]
Vitic Nexus – an authoritative toxicity database
• Vitic Nexus is a repository of toxicological data
  • Data donated by members
  • Curated and augmented by expert scientists
• Genotoxicity records
  • In vitro data – 146,444 records, 9,014 compounds
  • In vivo data – 10,157 records, 2,658 compounds
  • Overall call – 15,289 records, 8,510 compounds
• Contains public datasets and literature including
  • Benchmark, CGX, ISSSTY, IUCLID
  • FDA CDER & CFSAN
  • JETOC (Japanese Chemical Industry Ecology-Toxicology..)
  • IARC, JETOC, NIHS, NTP, SCCP, SIDS…
• Members also store their own data in Vitic Nexus
Data sharing consortia
• Lhasa facilitate pre-competitive data sharing
• Members of these consortia also see
  • Aromatic amines – 1,664 records, 145 compounds
  • Intermediates (includes boronic acid sub-group) – 13,834 records, 910 compounds
  • Excipients – 2,286 records, 764 compounds
In silico predictions for M7
• Use models that predict Ames outcomes
• 2 complementary methods should be applied
  • One expert rule-based
  • One statistical-based
• Models should follow OECD Principles for QSAR
• The absence of alerts from both is sufficient to conclude that the impurity is of no concern
• Expert review is needed to provide additional evidence for any prediction …and to explain conflicting results
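The combination logic in the bullets above can be sketched in a few lines of Python. This is an illustrative reading of the slide only, not Lhasa software or the text of the guideline: the `Call` enum, the function name and the outcome strings are all hypothetical.

```python
from enum import Enum


class Call(Enum):
    """Hypothetical outcome labels for a single in silico model."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    INCONCLUSIVE = "inconclusive"  # e.g. equivocal or out of domain


def m7_in_silico_outcome(expert_rule_based: Call, statistical: Call) -> str:
    """Combine one expert rule-based call and one statistical call."""
    calls = (expert_rule_based, statistical)
    # Absence of alerts from both methods: impurity is of no concern.
    if all(c is Call.NEGATIVE for c in calls):
        return "no concern"
    # Both positive: expert review still adds supporting evidence.
    if all(c is Call.POSITIVE for c in calls):
        return "predicted positive: expert review to add supporting evidence"
    # Anything else is a conflict or an inconclusive result.
    return "conflicting or inconclusive: expert review to explain the results"


if __name__ == "__main__":
    print(m7_in_silico_outcome(Call.NEGATIVE, Call.NEGATIVE))
    print(m7_in_silico_outcome(Call.POSITIVE, Call.NEGATIVE))
```

The sketch deliberately returns text rather than a final classification, since the slide stresses that expert review accompanies any prediction.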
Enhancing Derek Nexus for mutagenicity
• Designed to support expert analysis for M7
• Provide additional supporting information
• Recommend where expert should focus analysis
Sarah Nexus – an advanced statistical system
• Designed to address the ICH M7 guidelines
• Created with input from the FDA under a Research Collaboration Agreement
Making a prediction
• Query compounds are fragmented
• Each fragment is assessed
• Fragments not covered by the training set result in no prediction → out of domain
• Relevant hypotheses for each fragment are retrieved
• Hypothesis, signal, confidence, supporting examples
• Typically several hypotheses are returned
• Overall prediction = $\sum_{\text{hypotheses}} f(\text{prediction}, \text{confidence})$
• Absence of a strong overall signal → equivocal
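A minimal sketch of how such an aggregation could work is given below. The slide does not define the function f, so a signed, confidence-weighted vote is used here purely as a stand-in; the `Hypothesis` class, the `threshold` value and the use of `None` for an uncovered fragment are assumptions for illustration and do not describe Sarah Nexus internals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    """Hypothetical record for one retrieved hypothesis."""
    fragment: str
    positive: bool     # signal of the hypothesis
    confidence: float  # assumed to lie between 0 and 1


def overall_prediction(
    per_fragment: List[Optional[List[Hypothesis]]],
    threshold: float = 0.2,  # assumed cut-off for a "strong" signal
) -> str:
    # None marks a fragment that the training set does not cover.
    if any(hyps is None for hyps in per_fragment):
        return "out of domain"
    signed = 0.0
    total = 0.0
    for hyps in per_fragment:
        for h in hyps:
            # Stand-in for f(prediction, confidence): signed confidence.
            signed += h.confidence if h.positive else -h.confidence
            total += h.confidence
    if total == 0.0:
        return "equivocal"
    signal = signed / total  # normalised to the range -1 .. 1
    if signal >= threshold:
        return "positive"
    if signal <= -threshold:
        return "negative"
    return "equivocal"  # no strong overall signal


if __name__ == "__main__":
    query = [
        [Hypothesis("fragment A", True, 0.9)],
        [Hypothesis("fragment B", False, 0.3)],
    ]
    print(overall_prediction(query))
```

Normalising by the total confidence keeps the threshold independent of how many hypotheses are returned, which is one possible design choice among several.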
Confidence correlates with accuracy
[Figure: prediction outcomes by Sarah confidence score; bins 0-20%, 20-40%, 40-60%, 60-80%, 80-100%; y-axis 0 to 1]

Panel (slide order)   FP    FN    TP    TN
1                     22%   18%   31%   29%
2                     13%   10%   37%   40%
3                     9%    2%    50%   39%
4                     4%    2%    60%   34%
5                     6%    1%    70%   23%

$\mathrm{PPV} = \frac{TP}{TP + FP}$
$\mathrm{NPV} = \frac{TN}{TN + FN}$
$\text{b. acc} = \frac{\text{sens} + \text{spec}}{2}$

Confidence vs PPV
[Figure: PPV (0% to 100%) plotted against confidence (0% to 100%)]
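As a worked example, the PPV, NPV and balanced accuracy formulas can be applied to the FP/FN/TP/TN percentages quoted on the slide. The function names are illustrative, and the panels are taken in the order the values appear in the slide text; which confidence bin each panel belongs to is not stated there.

```python
def ppv(tp: float, fp: float) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)


def npv(tn: float, fn: float) -> float:
    """Negative predictive value: TN / (TN + FN)."""
    return tn / (tn + fn)


def balanced_accuracy(tp: float, fp: float, tn: float, fn: float) -> float:
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Percentages as quoted on the slide, in slide order.
panels = [
    {"fp": 22, "fn": 18, "tp": 31, "tn": 29},
    {"fp": 13, "fn": 10, "tp": 37, "tn": 40},
    {"fp": 9, "fn": 2, "tp": 50, "tn": 39},
    {"fp": 4, "fn": 2, "tp": 60, "tn": 34},
    {"fp": 6, "fn": 1, "tp": 70, "tn": 23},
]

if __name__ == "__main__":
    for number, p in enumerate(panels, start=1):
        print(
            f"panel {number}: "
            f"PPV={ppv(p['tp'], p['fp']):.2f} "
            f"NPV={npv(p['tn'], p['fn']):.2f} "
            f"b.acc={balanced_accuracy(p['tp'], p['fp'], p['tn'], p['fn']):.2f}"
        )
```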
Sarah Nexus Performance
• Sarah Nexus has been extensively evaluated by members
[Figure: coverage, balanced accuracy, specificity and sensitivity (0% to 100%) for each dataset]
• Datasets
  • Private 1, n = 744, 28% +ive
  • Private 2, n = 847, 12% +ive
  • Private 3, n = 437, 16% +ive
  • Private 4, n = 986, 4% +ive
  • Private 5, n = 1718, 14% +ive
  • Private 6, n = 320, 23% +ive
  • FDA, n = 809, 36% +ive
  • Public, n = 11209, 49% +ive
• Ranges annotated on the chart (slide order): 83-96%, 60-85%, 60-89%, 38-84%
• Sensitivity = $\frac{TP}{TP + FN}$
• Specificity = $\frac{TN}{TN + FP}$
• Balanced accuracy = $\frac{\text{sens} + \text{spec}}{2}$
• Sarah Nexus v1 under recommended settings
• Presented at SoT, March 2014
Sarah Nexus - Summary
• Sarah is a statistical approach to mutagenicity
• Maintains high coverage even with challenging datasets
• Provides information needed for expert analysis
Using in silico predictions
• M7 explicitly states that in silico predictions should be reviewed with expert knowledge
  • Provide supportive evidence for any prediction
  • Elucidate underlying reasons in case of conflicting results
• But how will this work in real life?
  • In silico methods combined with expert knowledge rule out mutagenic potential of pharmaceutical impurities: An industry survey
    • Regulatory Toxicology and Pharmacology, 2012, 62, 449–455
  • Use of in silico systems and expert knowledge for structure-based assessment of potentially mutagenic impurities
    • Regulatory Toxicology and Pharmacology, 2013, 67, 39
2 complementary methodologies should be applied
• Data
  • Expert system: uses all Lhasa data including consortia & donated confidential data + data mined on-site
  • Statistical system: only uses non-confidential data
• Methodology
  • Expert system: human-written rules based upon data & knowledge
  • Statistical system: machine-learning model using a hierarchical network
• Scope of alert
  • Expert system: hand-written Markush
  • Statistical system: fragments learnt by model
• Interpretability
  • Expert system: references, expert commentary, mechanistic explanation, scope of alert, some supporting examples
  • Statistical system: transparent methodology, learning summarised by hypothesis, direct link to training set, confidence in prediction
Using Sarah and Derek together
• How often do they disagree?
• When they agree, how accurate are they?
[Figure: agreement between Derek Nexus and Sarah Nexus, and balanced accuracy for concurring predictions (0% to 100%), for Private Dataset 1, Private Dataset 2, Private Dataset 3 and the Public Dataset; annotated ranges 69-85% and 62-90%]
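The two questions on this slide can be answered for any dataset with a short calculation, sketched below. The function name and the eight-compound toy data are invented for illustration and are not taken from any Lhasa or member dataset.

```python
from typing import List, Tuple


def agreement_and_concurring_accuracy(
    derek: List[bool], sarah: List[bool], ames: List[bool]
) -> Tuple[float, float]:
    """Return (agreement rate, balanced accuracy where both models agree).

    Assumes the concurring subset contains at least one Ames-positive
    and one Ames-negative compound, otherwise a division by zero occurs.
    """
    # Keep (shared prediction, Ames result) for compounds where models agree.
    agree = [(d, a) for d, s, a in zip(derek, sarah, ames) if d == s]
    agreement = len(agree) / len(ames)
    tp = sum(1 for p, a in agree if p and a)
    tn = sum(1 for p, a in agree if not p and not a)
    fp = sum(1 for p, a in agree if p and not a)
    fn = sum(1 for p, a in agree if not p and a)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return agreement, (sensitivity + specificity) / 2


# Toy data (hypothetical): True means positive / mutagenic.
derek = [True, True, False, False, True, False, True, False]
sarah = [True, False, False, False, True, True, True, False]
ames = [True, True, False, True, True, False, False, False]

if __name__ == "__main__":
    rate, accuracy = agreement_and_concurring_accuracy(derek, sarah, ames)
    print(f"agreement={rate:.2f}, balanced accuracy when agreeing={accuracy:.2f}")
```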
Acknowledgements: all the Lhasa members who worked closely with us during the evaluation and development of Sarah
Using Sarah and Derek together
• A simple conservative approach will increase sensitivity
  • …but at the cost of accuracy and specificity
[Figure: sensitivity vs specificity (both axes 0 to 1), with accuracy, for a private dataset; annotated values 0.68, 0.72, 0.74, 0.83]
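One reading of the "simple conservative approach" is to call a compound positive whenever either model does. The sketch below shows, on invented toy data, why that raises sensitivity while lowering specificity; the function name and all values are hypothetical and do not reproduce the private dataset behind the chart.

```python
from typing import List, Tuple


def sens_spec(predicted: List[bool], ames: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for boolean predictions."""
    tp = sum(1 for p, a in zip(predicted, ames) if p and a)
    fn = sum(1 for p, a in zip(predicted, ames) if not p and a)
    tn = sum(1 for p, a in zip(predicted, ames) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, ames) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): True means positive / mutagenic.
ames = [True, True, True, True, False, False, False, False]
derek = [True, True, False, False, False, False, True, False]
sarah = [True, False, True, False, False, True, False, False]

# Conservative combination: positive if either model is positive.
either_positive = [d or s for d, s in zip(derek, sarah)]

if __name__ == "__main__":
    for name, calls in (
        ("Derek", derek),
        ("Sarah", sarah),
        ("either positive", either_positive),
    ):
        sensitivity, specificity = sens_spec(calls, ames)
        print(f"{name}: sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Every false positive from either model is inherited by the combined call, which is where the loss of specificity comes from.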
Using Sarah and Derek together
• When they disagree, which is right?
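[Figure: charts comparing Derek Nexus and Sarah Nexus predictions; dataset labels extracted: Private Dataset 3, Public Dataset. Percentages in slide order: 72%, 7%, 14%, 7%; 73%, 9%, 11%, 7%; 80%, 9%, 7%, 5%; 27%, 17%, 25%, 31%]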
Handling conflicting predictions
• Confidence scores can give an indication…
• Machine-learnt & expert driven rules have been assessed
  • If both models agree
    • Take that consensus prediction
  • If one model has a high confidence prediction
    • Take the most confident prediction
  • If Derek says ‘positive’ and Sarah has a positive hypothesis (despite being negative overall)
    • Activity is most likely
  • If the positive prediction is of low confidence
    • Activity is unlikely
  • ….
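The four rules listed above can be written as an ordered decision function, sketched below. Representing both models' confidence as a number between 0 and 1, the `high` and `low` thresholds, and the fallback to expert review are assumptions made for this example; the slide does not give cut-offs, and the list of rules is left open-ended.

```python
from dataclasses import dataclass


@dataclass
class ModelCall:
    """Hypothetical summary of one model's prediction."""
    positive: bool
    confidence: float  # assumed scale from 0 to 1


def resolve(
    derek: ModelCall,
    sarah: ModelCall,
    sarah_has_positive_hypothesis: bool,
    high: float = 0.8,  # assumed "high confidence" cut-off
    low: float = 0.3,   # assumed "low confidence" cut-off
) -> str:
    # Rule 1: both models agree, take the consensus prediction.
    if derek.positive == sarah.positive:
        return "positive" if derek.positive else "negative"
    # Rule 2: exactly one model is highly confident, take its prediction.
    derek_high = derek.confidence >= high
    sarah_high = sarah.confidence >= high
    if derek_high != sarah_high:
        winner = derek if derek_high else sarah
        return "positive" if winner.positive else "negative"
    # Rule 3: Derek positive and Sarah holds a positive hypothesis.
    if derek.positive and sarah_has_positive_hypothesis:
        return "positive"
    # Rule 4: the positive call is of low confidence, activity unlikely.
    positive_call = derek if derek.positive else sarah
    if positive_call.confidence <= low:
        return "negative"
    # No rule applies: leave the decision to the expert.
    return "expert review"


if __name__ == "__main__":
    print(resolve(ModelCall(True, 0.5), ModelCall(False, 0.5), True))
    print(resolve(ModelCall(False, 0.5), ModelCall(True, 0.2), False))
```

The rules are applied in order, so a later rule is only reached when the earlier ones do not settle the call.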
Handling conflicting predictions
• Simple rules give increased coverage without loss of accuracy
[Figure: accuracy, sensitivity, true accuracy and coverage (axis 0.5 to 0.85) for Private Dataset 1 after each step]
• Step 1 – D and S agree
• Step 2 – Most confident prediction
• Step 3 – D says positive, S has positive hypothesis
• Step 4 – Low confidence positive
Ultimately, expert review is needed…
• Decision trees may help guide an expert, but expert review is still essential
• We have worked with our members to deliver the information needed for expert review
Supporting the expert workflow
• Step 1 – Derek prediction
  • Predicted negative but there is a ring system to assess
  • Derek Nexus now shows those compounds from the Lhasa Ames test reference set most closely related to the query
• Step 2 – Sarah prediction
  • Sarah predicts negative; no positive hypotheses seen
  • Derek and Sarah analysis agree
  • Supporting data from Vitic augments this prediction
• Step 3 – Vitic search – similarity chosen
  • Vitic shows a related ‘active’ for which there is no obvious cause (no Derek alert fires) and also a related ‘inactive’
  • Expert assessment – ring system not of concern
Possible reasons to over-rule a positive in silico call
• The presence of a second confounding alert that could have caused the activity
  • …a risk with statistical models
  • Minimised with Sarah’s recursive learning approach
• Mechanistic interpretation
  • …stereo-electronics preclude reaction through the accepted mechanism such as that described within Derek
• Similar analogues trigger the same alert and have been tested as inactive
  • …were not known to the model
What our members say…
• “Combined use of two complementary in silico systems such as Derek Nexus and SEP leads to an increase in negative predictivity and sensitivity, up to 99.1% and 94.7% respectively”
Poster “Comparative Evaluation of in Silico Systems for Ames Test Mutagenicity Prediction”
Ilse Koijen… Janssen, GTA Newark Oct 2013, www.gta-us.org/scimtgs/2013Meeting/posters2013.html
SEP = the pre-release version of Sarah
Summary
• M7 will allow predictions of mutagenicity to be submitted
• Derek has been extended to increase support for expert review
  • Making confident predictions of inactivity
  • Highlighting features worthy of attention
• Sarah has been designed to provide the statistical 2nd system
  • Recursive learning and a hierarchical network provide transparency and accuracy
• The performance of combined predictions has been described
  • Using a number of relevant confidential datasets
• Examples of expert decision-making illustrate their application
  • Use of Vitic, an authoritative database, supports this workflow