leadscope® genetox expert alerts version 3.0 white paper an …€¦ · that two distinct in...

20
An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016 www.leadscope.com 1 [email protected] Leadscope® Genetox Expert Alerts Version 3.0 White Paper An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

Upload: others

Post on 19-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 1 [email protected]

Leadscope® Genetox Expert Alerts Version 3.0

White Paper

An expert alert system to predict the mutagenic potential of

impurities to support the ICH M7 guideline

May 2016

Page 2: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 2 [email protected]

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline

Abstract

The current International Conference on Harmonisation (ICH) M7 guideline on drug impurities states that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic. This paper outlines the development and use of a new expert rule-based system to predict the results of the bacterial mutagenicity assay. In the development of this system, an initial library of mutagenicity structural alerts was identified from the literature. This process included consolidating the same or similar alert definitions along with their plausible mechanisms from multiple publications. Factors that activate and deactivate the alerts were also identified from the literature and through an analysis of the corresponding data using the Leadscope data mining software. Certain chemical classes, such as primary aromatic amines and alkyl halides, were refined using a SAR fingerprint approach using data from public and proprietary databases to refine the definition of the alert. 290 distinct alerts were identified and these alerts were further validated against a reference database consisting of more than 10,000 chemicals with known bacterial mutagenicity results in addition to proprietary data sources. Only validated alerts with a sufficiently strong association with positive expert-reviewed calls from Salmonella and E. coli strains were included. A prediction of the bacterial mutagenicity assay is made using these validated alerts only when the compound is within the applicability domain of the alert system. A confidence score based on information collected for each alert is provided alongside the positive or negative call. In addition, the most sensitive tester strains per alert are provided to support appropriate follow-up experimental testing of any positive alert prediction. This paper outlines the expert alert system and presents validation results, both as a standalone system and in combination with statistical-based approaches and available experimental data. It describes how information on the alert such as its associated mechanisms as well as historical data can support an expert analysis of the results.

Introduction

Impurities are generated as part of the pharmaceutical manufacturing process. They result from using different reagents, catalysts and solvents in the synthesis of the drug substance and from precursors that carry over into the final product. They are also attributable to subsequent degradation. The International Committee on Harmonisation (ICH) has published a guideline that covers the qualification of mutagenic impurities [1]. The purpose of the guideline is to limit carcinogenicity risk from any impurities present in the drug substance. As such, the guideline’s focus is on identifying DNA reactive substances usually detected using the bacterial reverse mutation assay (Ames) [2]. In the absence of carcinogenicity or Ames data for the actual or potential impurities, a computational structure-activity analysis can be performed to help understand whether a substance can be classified as non-mutagenic. When this negative classification is not possible, further in vitro or in vivo tests are required to support a negative classification or the impurity should be controlled below the limits established in the guideline.

To perform the computational structure-activity analysis, the guideline states that two complementary in silico methodologies should be used in the assessment. One should be expert rule-based and the second should utilize a statistical-based methodology. The methods used in the assessment should adhear to the OECD validation principles for (Q)SAR systems [3] and their prediction results may be reviewed using expert analysis, especially in cases where the results are conflicting. If the results from the two methodologies indicate no mutagenic potential, the impurity can be classified as having no

Page 3: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 3 [email protected]

mutagenic concern. Amberg et al., 2016 has recently documented a series of principles and procedures for completing a computational assessment aligned with the ICH M7 guideline. [42] The guideline also discusses in Note 2 that where it is not feasible to isolate or synthesize the impurity or when compound quantity is limited “…bacterial tester strains … proven to be sensitive to the identified alert” may be used.

An expert rule-based system is one of the methodologies required to support an ICH M7 compliant computational analysis. In this type of system, knowledge concerning different structural features that are associated with mutagenicity is encoded as rules in the system. If these structure-activity rules map onto the test compound being evaluated, it may indicate the compound is mutagenic. A number of current systems use this rule-based methodology including ToxTree[4], HazardExpert [5], DEREK [6,7], and Oncologic [8]. ToxTree is a rule-based system that incorporates a module for mutagenicity which encodes the Benigni-Bossa rule-base. HazardExpert contains a mutagenicity module which uses of a series of toxicophores from the literature. DEREK is an alerts-based system that incorporates mutagenicity rules and generates several levels of prediction including “Certain”, “Probably”, “Plausible”, “Equivocal”, “Doubted”, “Improbable”, “Impossible”, “Open”, “Contradicted”, and “Nothing to report” and also helps to support an assessment of whether an alert is inactive by evaluating misclassified and unclassified features.

The existing rule-based systems were not developed to specifically address the criteria of the ICH M7 guideline. One of the main disadvantages of many current approaches is their limited applicability domain assessment (one of the OECD validation principles). This is a particularly important part of a M7 submission, since the lack of an alert should never be used to state that a compound has no mutagenic concern. This would imply that all possible structural alerts have been identified and that chemistry is static. Aryl boronic acid is an example of new chemistry where a new structural alert recently had to be identified [9].

In addition, the alerts encoded in these systems have generally not been quantitatively assessed using a large database of Ames results. This is important since the guideline explicitly states the necessity for the alerts system to be predictive of the bacterial mutagenicity assay. This also supports transparency in the prediction results by providing the data used to assess and justify the structural alerts.

Decisions compliant with the ICH M7 guideline require a clear positive or negative call. While less precise predictions may be helpful when applying these tools in early discovery for prioritization, they are ambiguous for regulatory needs. Prediction results such as “Probable” or “Plausible” need to be translated into a mutagenic/non-mutagenic assessment. This level of ambiguity may lead to inconsistent application of these tools.

Finally, the existing expert alert systems often do not address the complex relationships between regulatory acceptable statistical model results, alert results, available experimental data and the necessity for expert opinion support tools that are essential when supporting the ICH M7 guideline.

The following paper outlines an expert rule-based system designed to address the needs of regulators and pharmaceutical scientists who must implement the ICH M7 guidance. The alert knowledge base is built using a quantitative assessment of publicly-cited alerts over a large database of Ames assay testing results. For each alert, activating and deactivating factors are enumerated through a systematic analysis of the available Ames data and the literature. The alerts are organized hierarchically, to precisely define the relationship between any alerts containing any specific sub-classes of concern. This rule base is

Page 4: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 4 [email protected]

annotated with plausible mechanisms collected from the literature. This linkage between the data and the biological interpretation of the alerts is critical in supporting a transparent assessment of the results. This paper outlines the methods used to build the alert knowledge base and the methodology used to make predictions. Validation results are presented and a discussion of its use in supporting the ICH M7 guideline, including expert analysis, is provided.

Methods

Reference set

An extensive high quality genetic toxicity database containing the results of the bacterial mutagenicity assay along with chemical structures has been used to support the development of this rule-based expert system. This database, referred to as a reference set, supports the evaluation of the expert rules. The use of the term reference set is used to contrast this type of database from a training set used in building QSAR statistical-based models. This set was not be used to ‘build’ the alerts system, rather to (1) to assess and refine the alerts cited in the literature (in combination with an expert judgment), (2) to support applicability domain assessment,(3) to generate statistics reflecting the confidence in the alert, and (4) identify classes of mutagenic compounds that are not covered by alerts cited in the literature.

The reference set was constructed from three sources, two high quality genetic toxicity databases and data shared with Leadscope from various organizations. The genetic toxicity database sources include: 1) the training sets used to build QSAR models at the US FDA with the Research Collaboration Agreement (RCA) partners (RCA-QSAR) [10,11] and 2) the Leadscope 2016 SAR Genetox database [12,13].

The RCA-QSAR database is comprised of 3,979 chemicals with a Salmonella call aggregated from the common Ames strains (TA98, TA100, TA1535, TA1537) as well as 1,198 chemicals with a composite Salmonella TA102 / E. coli strain call. These datasets contain non-proprietary data harvested from FDA approval packages and the published literature.

The Leadscope 2016 SAR Genetox Database is an extensive collection of genetic toxicology studies (including bacterial mutagenicity, chromosome aberration, mammalian mutagenesis, and in vivo micronucleus) and contains 8,717 chemicals with graded bacterial mutagenicity calls. Information was collected from an electronic source or manually harvested to create the data set. Specifically, the sources included: (1) the US Food and Drug Administration (FDA) Center for Food Additives and Applied Nutrition (CFSAN) Food Additive Resource Management system (FARM) and Priority based Assessment of Food Additives (PAFA) [14]; (2) the US FDA’s Center for Drug Evaluation and Research (CDER) Pharmacology Reviews based on the new drug approval (NDA) documents [15]; (3) the Chemical Carcinogenicity Research Information System (CCRIS) [16]; (4) the National Toxicology Programs (NTP) genetic toxicology database [17]; (5) the Tokyo-Eiken database [18]; (6) the ANEI publicly released data; (7) donated from commercial organizations; and (8) and other publications.

Bacterial mutagenicity data was also shared with Leadscope by a number of organizations for inclusion in the alert reference set. This information was used to improve the predictive performance of alerts for specific chemical classes as well as to expand the applicability domain of the alerts system.

The process of constructing the database was performed in two steps. First, the chemical structures were merged using a chemical registration system to assign a unique identifier to each chemical and merging entries on the basis of this identifier. Next the graded endpoints for Salmonella and E. coli were

Page 5: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 5 [email protected]

combined from the different sources, resulting in a database of 9,973 chemicals each with a positive/negative overall bacterial mutation call. The distribution of positive and negative data is summarized in Figure 1. The reference set also covers a diverse collection of compounds since they have been derived from many different sources, including pharmaceuticals, pesticides, industrial chemicals and food additives. This diversity is illustrated by clustering the reference set with the Leadscope software, using a cut-off distance (i.e. height) of 0.5 [19-24]. An example of 4 clusters (out of the 3,342) is presented in Figure 2. There were 1,657 clusters with two or more examples, and 1,685 singletons (clusters with one example).

Figure 1: Summary of the database endpoint content

Figure 2: Clustering of the reference sets by structural similarity

Page 6: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 6 [email protected]

Alert compilation

A bacterial mutagenicity structural alert is a molecular substructure defining a reactive center, as illustrated in Figure 3. In this figure, an aziridine substructure is shown which has been cited in multiple publications as a structural alert for mutagenicity ("aromatic and aliphatic aziridinyl" [25]; "aziridine" [26]; "oxiranes and aziridines" [27]; "SA_7: epoxides and aziridines" [28,29]). For any mutagenicity structural alert, there should be a relationship between this reactive center identified within the molecule and its ability to either directly or indirectly (through one or more metabolic steps) interact with DNA. For example, in [29] aziridines have been described as “… extremely reactive alkylating agents that may react by ring-opening reactions … activity of these compounds depends on their ability to act as DNA cross-linking agents, via nucleophilic ring-opening of the aziridine moiety by N7 positions of purines.”

Figure 3: Aziridine example alert

A number of publications exist for the proposed alerts [25-29,31-34]. The first step in the process of developing an expert alert-based system is to encode these published alerts as substructural definitions. This involves defining the one or more substructures to describe each of the alerts. Many publications also include a description of the mechanistic basis for the proposed alerts and this rational is captured alongside the structural definitions.

Next, the alerts are consolidated wherever possible; however, this process is challenging since not all publications define the same alert in the same way. This is illustrated in Figure 4 where four papers cite the alert aziridine in different ways. The Ashby 1988 paper has substitutions on both carbons and any substitution on the nitrogen is ambiguous [25], the Kazius 2005 paper places no restrictions on any attachments [26], the Bailey 2005 and Benigni 2008 & 2011 papers’ definitions include both epoxides/oxiranes together in the same definition [27-29]; however, no restrictions on any of the atoms are presented in Bailey but in the Benigni papers the nitrogen requires one attachment to any atom. The process of consolidating the alerts must take into account the different definitions and ways the alert has been defined. The plausible mechanisms are also helpful in refining the alerts’ definitions.

Figure 4: Example of alerts definitions from different papers

Page 7: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 7 [email protected]

Once these alerts have been encoded and consolidated, they are organized hierarchically, as illustrated in Figure 5. In this example, the alert 28: nitrosamine is a parent alert, with three more specific child alerts related to it. There are several reasons for doing this. First, it helps in establishing a mechanistic explanation, particularly where any child alert is lacking or has limited mechanistic information, as it may be inherited from the parent alert. Secondly, when the expert alerts are used to make prediction, a score is calculated reflecting the precision of the alert. If more than one alert matches the target compound being assessed, then only the score for the most precisely defined alert (i.e. the alert that is closest to any terminal node in the hierarchy) is displayed. Finally, while providing a summary of the alert(s) that fire is useful, a list of many related alerts is not helpful. This hierarchy can be used to select the most precise and relevant alert for a test compound for display.

Figure 5: Hierarchical organization of alerts

In addition to the primary alert, it is also important to define any factors that would activate or deactivate the alerts as a result of electronic or steric effects or by blocking an important metabolic step. For example, for the alert primary aromatic amine, an acidic group in the para position “…prevents proton abstraction from NH2 group of anilines … by the ferric peroxo intermediate of CYP1A2.” [30]. Unfortunately, there is limited information in the literature on precise factors that activate or deactivate the alerts, therefore, an exercise of data mining the reference set was undertaken to better understand these factors. This process used the informatics tools available in the Leadscope software to identify and quantitatively assess activating and deactivating factors [19-24]. Figure 6 illustrates three example deactivating fragments for the aromatic nitro alert.

Figure 6: Examples of deactivating fragments

Data was shared by a number of organizations under confidentiality agreements and used to derive knowledge for inclusion in the expert alert system. SAR knowledge was derived from this data using a

Page 8: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 8 [email protected]

series of Leadscope informatics applications and incorporated as updated alerts (including improvements for alert activation and deactivation). A SAR fingerprint methodology was used to incorporate proprietary knowledge around the structure-activity relationships, without exchanging chemical structures or data [42].

Alerts for specific chemical classes (such as primary aromatic amines or PAAs) may have a relatively high number of false positives when applied to proprietary datasets, since the underlying structural reasons for deactivation may not be fully understood or present in the reference data set or in the literature. A list of aromatic amine substructures was defined and combined into a SAR fingerprint for identifying and sharing structure-activity relationships. The number of positive/negative examples were identified over a series of compound collections and pooled across multiple organizations. This information was used to identify structural classes that activate or deactivate aromatic amine mutagenicity (as illustrated in Figure 7). This knowledge was incorporated into version 2 of the expert alert system to improve its performance for this chemical class and the results have been documented in a publication [40].

Figure 7: Pooling SAR knowledge across multiple public and proprietary collections

The literature also describes sub-classes of alerts that represent highly active subclasses. These are essentially another alert; however, they can be linked with the alerts through the hierarchical relationship as described earlier. For example the benzene, 1-amino(NH2)-, 4-aryl- is identified as a specific sub-class of the primary aromatic amines since “these chemical features make the nitrenium ion (DNA-reactive intermediate) more stable due to the electron donating property of the benzene ring, and thus more reactive with DNA” [35]. There are again limited examples in the literature so a data mining exercise of the reference set was undertaken to identify these classes [42].

As discussed earlier, in Note 2 of the ICH M7 guideline [1] it states that where it is not feasible to isolate

or synthesize the impurity or when compound quantity is limited “…bacterial tester strains … proven to

be sensitive to the identified alert” may be used. The results from this assessment may be used to assess

literature data or company historical data where data for all strains is not available, to qualify an alerting

compound quickly by performing a test using the most responsive strains, and to determine the most

responsive strains to run where there is limited material to test. A database of compounds containing

Page 9: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 9 [email protected]

bacterial mutagenicity data for all 5 standard Ames strains (TA98, TA100, TA1535, TA1537, TA102 or E.

coli WP2) was analyzed for relative strain sensitivity. The Leadscope expert alerts (version 3) were then

applied to the entire database. For the set of database structures matching each alert, the frequency,

mean, p-value, and percent of bacterial mutagenicity positives found (strain mean/bacterial mutation

mean) was calculated across each tester stain and strain combination. The responsiveness of each strain

and strain combination was calculated for each alert graded as [43]:

• Highly responsive (%positives >= 95%, mean > 0.70 and p-value < 0.015)

• Responsive (%positives > = 80% mean > 0.70 and p-value < 0.1)

• Not sufficiently responsive (all other situations)

For each Leadscope alert definition, those strains that are highly responsive for bacterial mutation are marked as “recommended” for follow-up experimental limited-strain Ames testing.

The complex relationships between the alerts, the activating/deactivating fragments and/or highly active subclasses (all organized in a hierarchy) as well the accompanying non-structural information such as the source of the alerts, the mechanistic rationale for the alert and the sensitive bacterial tester strains are encoded as an XML document [36]. The alert can also be defined as one or more “Rules” to accommodate examples such as polycyclic aromatic hydrocarbons, which require the definition of a number of unique cyclic systems to fully define all possible planar systems for this alert. This XML document is linked to the structural definitions for the alerts, deactivating fragments, and the highly active subclasses, which are defined in the MOL file format [37]. The information contained in the XML plus the structural definitions constitutes the knowledge base used by the expert alerts system. Figure 8 provides an example of an XML alert record for an alert.

Figure 8: Example of an XML record to represent an alert

Alert Assessment

The purpose of the alert assessment process is to establish which alerts should be used to make predictions. These “active” alerts should have clear evidence that they are predictive of a positive outcome in the bacterial mutagenicity assay, as required by the ICH M7 guideline. For this assessment, alerts are initially assigned as active when there are more than five examples of compounds in the

Page 10: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 10 [email protected]

reference set that match the alert and when greater than 70% of those examples are positive. However, it is not sufficient to rely solely upon the number of examples for each alert. A second assessment is performed by removing those compounds also containing other alerts. When compounds containing only the alert in question also show good correlation, then there is more convincing evidence that the alert should be assigned as active. This assessment is illustrated in Figure 9 with two examples. There is good supporting data to include the aziridine in the list of active alerts. Another example of a potential alert is furan, a proposed alert for mutagenicity from the same paper [30]. There is some evidence of an association in the reference set with 143 examples since almost 66% were positive (compared to 47% active in the entire reference set). However, if compounds containing additional alerts such an aromatic nitro, etc. are removed, there are only 33 remaining examples from the reference set with a mere 24% positive (lower than the average for the reference set). Hence, based upon this analysis there is not enough evidence to suggest furan should be included as an active alert.

Alert Number of examples (all) Precision (all)

Number of examples only contain the alert (isolated) Precision (isolated)

24: aziridine 33 1 19 1

85: furans 143 0.6573 33 0.2424

Figure 9: Alert assessment

Alerts are also classified as “inactive” for the bacterial mutagenesis assay. These alerts show little correlation with positive data values. In these cases a cut-off of five examples with less than 50% positive, (which is the approximate percentage of active/inactive in the entire reference set), is used.

There is a final category which is assigned to the remaining alerts – “indeterminate”. There are a number of reasons an alert might be categorized as indeterminate. For example, alerts with insufficient examples to classify them as either active or inactive would fit into this category. Alternatively, the number of positive examples may be between the active and inactive cut-off (50% - 70% positives). This might happen when there is little data present concerning deactivating factors or highly active subclasses. Another group of indeterminate alerts exist when additional information is present which questions the reliability of the alert. For example, it has been reported that acid halides show positive Ames data only in the presence of and as a consequence of the use of DMSO [38]. Hence, this additional knowledge could be used to assign them as indeterminate. Although, indeterminate alerts are not used to make any positive calls, they may be discussed in a subsequent expert review. Since the alerts are hierarchically organized, it is possible to have a mixture of active, inactive and indeterminate alerts at various levels of the hierarchy which illustrates another benefit of this hierarchical organization.

All structural definitions used to define the alert rules (including those responsible for both activation and deactivation), were run against a series of proprietary databases, including pharmaceutical companies proprietary collections. The number of positive and negative examples were identified for each collection and then pooled across all sources. This information was in turn used to further refine the alerts. This process included changing the status (active, inactive, indeterminate) of any alert as well as refining the structural definition of the alerts, including activating and deactivating features (as illustrated in Figure 10).

Page 11: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 11 [email protected]

Figure 10: Refining alerts based on results from both public and proprietary databases

Classification and scoring

A positive call is made for compounds within the applicability domain where an active alert matches the test compound and there are no deactivating fragments present (in the specified relationship to the primary alert). In Figure 11, the two chemicals presented would be predicted as positive. Chemical A matches an alkyl hydrazines,terminal non-aryl hydrazine, and aliphatic terminal hydrazine alerts (highlighted in red) and chemical B matched both an aromatic nitro as well as an aziridine. Compound A has a precision score is 1.0 (based on the highest alert precision value). Similarly compound B has a precision value associated with the aziridine (0.97).

Figure 11: Examples of positive results

A negative call is made when a test compound contains an active alert with a deactivating feature present that specifically deactivates the alert. A negative call is also made where no alert is identified and the test compound is within the applicability domain. In Figure 12, both chemicals C and D are predicted negative. In chemical C, an aromatic nitro is present (shown in gray); however, there is a corresponding deactivating fragment close to the aromatic nitro. In chemical D, no alerts are identified. A prediction for both chemicals can be made because they are in the applicability domain of the alerts specifically they have at least one chemically similar neighbor in the reference set.

Page 12: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 12 [email protected]

Figure 12: Examples of negative results

It is essential not to predict novel chemical classes if they have no supporting historical data. Domain assessment is performed using a structure similarity search against the reference set (based upon the Leadscope structural fingerprint and a Tanimoto distance) [19-24]. This ensures that negative predictions are not extrapolated to novel areas of chemistry where there is no Ames data. An indeterminate call occurs only when an alert marked as indeterminate is identified yet when the compound is still in the domain of applicability. Figure 13 illustrates an indeterminate result (E) and a not-in-domain result (F).

Figure 13: Examples of out-of-domain and indeterminate results

Accompanying the positive or negative classification is a precision score indicating confidence in the prediction. This score reflects positive mean example compounds from the reference set (or proportion

Page 13: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 13 [email protected]

of positives). When multiple alerts match the test compound, the most precise or highest score is selected. When a negative classification is made, the score is reflective of the background positive mean for all reference set compounds containing no active alerts. In Figure 9, chemical A matches a terminal non-aryl hydrazine which is 100% positive based upon the examples in the database. Compound B in Figure 11 matched two alerts and the alert with the highest precision value is selected, in this case aziridine. In Figure 12 compounds C and D both have a precision score of 0.1613 which is the positive mean for all compounds that do not contain an alert in the reference set.

Consensus rules for ICH M7

The ICH M7 guidance recommends that two in silico methodologies are used in the assessment of impurities; however, if data is available it can also be used. Thus a complex series of possibilities exist for generating an overall final call. Generally, the availability of data will take precedence over a prediction. If one or more prediction method is positive, in the absence of data related the predicted endpoint, the overall call is positive.

Results

The Leadscope Genetox Expert Alerts have been implemented as part of the Leadscope Model Applier v2.1 alongside the existing statistical-based QSAR models. To compare the performance of version 1 (released March 2014), 2 (released April 2015) and 3 (released April 2016) of the alert system, three sets of compounds were combined into a validation set:

9,789 compounds from the Leadscope expert alerts reference set

4,021 compounds from the National Institute for Health Science, Japan

6,512 compounds from the Hansen set [39]

Duplicates across these three collections were removed resulting in a final validation test set of 14,404 compounds. Figure 14 compares the performance using this validation set for version 1, 2 and 3 of the expert alerts.

Alerts Leadscope Genetox Expert Alerts v3

Leadscope Genetox Expert Alerts v2

Leadscope Genetox Expert Alerts v1

True positive 4,527 4,536 4,470

False negatives 966 1,029 1,075

True negatives 7,014 7,062 6,602

False positives 1,001 1,060 1,598

Concordance 85.4% 84.7% 80.6%

Sensitivity 82.4% 81.5% 80.6%

Specificity 87.5% 86.9% 80.5%

Positive predictivity 81.9% 81.1% 73.7%

Negative predictivity 87.9% 87.3% 86.0%

Figure 14: Predictivity of v1 and v2 of the expert alerts

In version 2 and 3 of the Genetox Expert Alerts sharing data and knowledge derived from proprietary databases improved the performance and for certain structural classes expanded the domain of applicability for these models. The number of false positives in versions 2 and 3 decreased by over a

Page 14: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 14 [email protected]

third versus version 1 through the addition and refinement of deactivating alerts - resulting in a corresponding performance increase in specificity and positive predictivity. The number of false negatives was also reduced slightly though the addition of new alert rules targeting specific subclass activity – resulting in a slight performance increase in sensitivity and negative predictivity. Overall accuracy improved by approximately 5% to over 85% since version 1.

The coverage of version 2 and 3 alerts using this set was similar to that in version 1. The number of out-of-domain predictions dropped and the number of indeterminate alerts present increased slightly due to an increased understanding of structure-activity relationships for certain classes.

The Leadscope expert alerts were independently reviewed in a recent 2014 poster presented by the US FDA and it was concluded that “Overall, the expert alert system showed high sensitivity and provided adequate supporting evidence to understand a given prediction, making it suitable for application under ICH M7. Additionally, it should be noted that this expert system generates negative predictions using a reference set of 7112 number of chemicals.” [41]

Discussion

ICH M7 guidelin outlines for regulators and pharmaceutical sponsors the process for qualifying a drug impurity as having no mutagenicity concern. As part of this process a computational analysis can be performed to qualify certain classes of impurities as negative. The guideline also states that any in silico system should adhere to the OECD validation principles where (Q)SAR models should have: “1) a defined endpoint, 2) an unambiguous algorithm, 3) a defined domain of applicability, 4) appropriate measures of goodness-of-fit, robustness and predictivity, 5) a mechanistic interpretation, if possible.” The expert alert-based system described here was built to specifically make predictions for the bacterial mutagenicity endpoint following the ICH M7 guidance. The algorithm has been outlined in detail including the methodology used to assess the applicability domain. The results from an external validation have been presented which summarizes the system’s goodness-of-fit, robustness and predictivity. Finally, when any alert matches a target test compound, a summary of the literature-derived mechanistic basis for each matched alert is provided.

One of the guiding principles in the development of this system was transparency of the predictions. This is implemented by linking the alerts to a quantitative assessment based on a collection of mutagenicity data. This not only provides a level of confidence in the results but also supports expert judgment that may accompany the results. The alerts are defined such that it is possible to immediately answer questions such as: which literature sources did the alert come from? what is the alert structure definition? how is the alert deactivated? what tester strains are the most sensitive? do any mechanistic justifications exist for the alert? and what data supports the use of the alert? (see Figure 15 and Figure 16). The system also highlights the presence of any indeterminate alert for discussion as part of an expert review.

Page 15: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 15 [email protected]

Figure 15: Explanation of alert firing

Figure 16: Details of historical data for the alert

The system also provides functionality to identify any potentially ad hoc reactive groups present in the impurity by querying the reference set as a database. This is used when writing an expert opinion when no active alert is present in order to justify there are no additional potentially reactive groups not accounted for by the alerts. A chemical substructure analysis is performed (using the Leadscope features), based on chemical fragments present in both the impurity and the reference database. A substructure search of the whole impurity is also performedThe number of positive and negative examples are shown with each substructure. When any substructure has significantly higher than expected number of positives, i.e. the subset’s mean is significantly greater than the mean for the entire database, then the substructure may represent a potentially reactive group if that group is present in a similar chemical environment. Figure 17 illustrates an analysis of potentially reactive groups.

Page 16: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 16 [email protected]

Figure 17: Analysis of other potentially reactive features

For each individual alert it is also possible to identify responsive bacterial tester strains, as shown in Figure 18. The first example matches alert #73 (1,1,-dihaloalkanes) and in the alert definition it is shown to be responsive in TA100 alone (identifying 96% of the positives). In the second example, the compound matches alert #267 (aromatic amine(NH2) (strong activating anilines)). In the alert definition, the combination of TA98 and TA100 was identified as the most sensitive strain identifying over 99% of the positives.

Figure 18: Identification of responsive bacterial tester strains per alert

Page 17: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 17 [email protected]

Using a series of rules, the system also combines results from both an expert alert-based system and statistical-based QSAR models to generate an overall consensus call to use in the assessment of the ICH M7 results. The ICH M7 guideline includes language stating that: 1) when an API has been tested with negative results, but 2) an impurity was predicted positive and 3) both the API and the impurity share a positive alerting fragment, then 4) it is possible to argue that the impurity could be classified as negative. By highlighting the components of both the statistical-based model as well as the expert-alerts based model that triggered the positive results, it is possible to readily make this expert review, as illustrated in Figure 19.

Figure 19: Looking at the weight of evidence

Conclusion

This paper describes an expert rule-based system designed to support the ICH M7 guideline for drug impurities. The current work details the process of developing an alert knowledge base as well as a supporting reference set used to assess the alert rules. These alert rules are based on well-defined mutagenicity structural alerts from the literature which have been further refined to include additional activating/deactivating factors as well as active subclasses (which represent Increased concern). The paper has also described how predictions are generated by the alert system and the various factors influencing the final prediction. The system does not extrapolate to areas of chemistry it has never seen by using an unambiguous applicability domain assessment. The good validation results and its adherence with the ICH M7 guideline and OECD validation principles allows this system to be used in the regulatory assessment of impurities with confidence.

Acknowledgements

We would like to thank Dr. Naomi L. Kruhlak and Dr. Lidiya Stavitskaya from the US FDA who, through the Research Collaboration Agreement with Leadscope have provided Leadscope with access to the RCA-QSAR training data set as well as valuable input into the system. We would also like to thank Dr. Errol Zeiger and Dr. Ron Snyder for their expert advice as well as the beta testers from the pharmaceutical industry and consultants to the pharmaceutical industry. We would also like to thank all the organizations that participated in the SAR fingerprinting exercises[40].

References

[1] Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk (M7)

(http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Multidisciplinary/M7/M7_Step_4.pdf accessed: August 28th, 2015).

Page 18: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 18 [email protected]

[2] OECD guideline for testing of chemicals: Bacterial Reverse Mutation Test 21st July 1997

(http://www.oecd-ilibrary.org/environment/test-no-471-bacterial-reverse-mutation-test_9789264071247-en accessed March 5th 2014).

[3] OECD (2007). Guidance document on the validation of (Quantitative) structure activity relationships [(Q)SAR] models.

(http://ihcp.jrc.ec.europa.eu/our_labs/predictive_toxicology/background/oecd-principles accessed March 5th 2014).

[4] Benigni R, Bossa C, Jeliazkova N, Netzeva T & Worth A (2008b). The Benigni / Bossa rulebase for mutagenicity and carcinogenicity – a module of Toxtree. EUR 23241 EN. http://ecb.jrc.ec.europa.eu/qsar/publications/.

[5] Smithing MP & Darvas F (1992). Hazardexpert: an expert system for predicting chemical toxicity. In: Food Safety Assessment (JW Finlay, SF Robinson and DJ Armstrong). Washington, American Chemical Society: 191-200.

[6] Ridings JE, Barratt MD, Cary R, Earnshaw CG, Eggington CE, Ellis MK, Judson PN, Langowski JJ, Marchant CA, Payne MP, Watson WP & Yih TD (1996). Computer prediction of possible toxic action from chemical structure: an update on the DEREK system. Toxicology 106(1-3), 267-279.

[7] Sanderson DM & Earnshaw CG (1991). Computer prediction of possible toxic action from chemical structure; the DEREK system. Human and Experimental Toxicology 10, 261-273.

[8] Woo Y-T & Lai DY (2005). OncoLogic: A mechanism-based expert system for predicting the carcinogenic potential of chemicals. In Predictive Toxicology (C Helma, ed), 385-413.

[9] Michael R. O’Donovan, Christine D. Mee, Simon Fenner, Andrew Teasdale, David H. Phillips Boronic acids—A novel class of bacterial Boronic acids—A novel class of bacterial Mutagen, Mutation Research/Genetic Toxicology and Environmental Mutagenesis, Volume 724, Issues 1–2, 18 September 2011, Pages 1–6.

[10] L. Stavitskaya, B. L. Minnier, R. D. Benz, N. L. Kruhlak, FDA Center for Drug Evaluation and Research, SOT 2013 “ Development of Improved Salmonella Mutagenicity QSAR Models Using Structural Fingerprints of Known Toxicophores”.

[11] L. Stavitskaya, B. L. Minnier, R. D. Benz, N. L. Kruhlak, FDA Center for Drug Evaluation and Research, “Development of Improved QSAR Models for Predicting A-T Base Pair Mutations “,GTA 2013b poster.

[12] Leadscope SAR Genetox Database 2013 User Guide.

[13] Yang C, Hasselgren CH, Boyer S, Arvidson K, Aveston S, Dierkes P, Benigni R, Benz RD, Contrera J, Kruhlak NL, Matthews EJ, Han X, Jaworska J, Kemper RA, Rathman JF and Richard AM. Understanding Genetic Toxicity Through Data Mining: The Process of Building Knowledge by Integrating Multiple Genetic Toxicity Databases, Toxicology Mechanisms and Methods, 2008, 18:2, 277 – 295.

[14] Food and Drug Administration CFSAN EAFUS. 2007. U.S. FDA Center for Food Safety and Applied Nutrition, Everything Added to Food in the United States. (http://www.fda.gov/food/ingredientspackaginglabeling/foodadditivesingredients/ucm115326.htm Accessed March 9, 2014).

Page 19: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 19 [email protected]

[15] FDA CDER FOI site: http://www.accessdata.fda.gov/scripts/cder/drugsatfda. Accessed March 9, 2014.

[16] CCRIS: http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CCRIS Accessed March 9, 2014.

[17] R W Tennant The genetic toxicity database of the National Toxicology Program: evaluation of the relationships between genetic toxicity and carcinogenicity. Environ Health Perspect. 1991, 96: 47–51.

[18] Tokyo-Eiken. 2007. Tokyo Metropolitan Institute of Public Health, Mutagenicity of food additives.

[19] Roberts, G.; Myatt, G.J.; Johnson, W.P.; Cross, K.P.; Blower, P.E.; Leadscope: Software for Exploring Large Sets of Screening Data. J. Chem. Inf. Comput. Sci., 2000, 40, 1302-1314.

[20] Johnson, D.E.; Blower, P.E.; Myatt, G.J.; Wolfgang, G.; Chem-tox informatics: Data mining using a medicinal chemistry building block approach. Current Opin. Drug Disc. Develop., 2001, 4(1), 92-101.

[21] Cross, K.P.; Myatt, G.; Yang, C.; Fligner, M.F.;Verducci, J.S.; Blower, Jr., P.E. Finding Discriminating Structural Features by Reassembling Common Building Blocks, J. Med. Chem., 2003, 46, 4770-4775.

[22] Blower, P.; Cross, K.; Fligner, M.; Myatt, G.; Verducci, J.; and Yang, C. Systematic analysis of large screening sets, Curr. Drug Disc. Technol., 2004, 37-47.

[23] Yang, C.; Cross, K.; Myatt, G.; Blower, P.; Rathman, J. Building Predictive Models For PTP1B Inhibitors Based On Discriminating Structural Features By Reassembling Medicinal Chemistry Building Blocks, J. Med. Chem., 2004, 47, 5984 – 5994.

[24] Blower, P.E.; Cross, K.C.; Eichler, G.S.; Myatt, G.J.; Weinstein, J.N.; and Yang, C. Comparison of Methods for Sequential Screening of Large Compound Sets, Comb. Chem. High Throughput Screen., 2006; 9:115-22.

[25] Ashby J, Tennant RW. Chemical structure, Salmonella mutagenicity and extent of carcinogenicity as indicators of genotoxic carcinogenesis among 222 chemicals tested by the U.S.NCI/NTP Mutat. Res. 1988, 204:17-115.

[26] Bailey AB, Chanderbhan N, Collazo-Braier N, Cheeseman MA, Twaroski ML. The use of structure-activity relationship analysis in the food contact notification program. Regulat Pharmacol Toxicol 2005; 42:225-235.

[27] Benigni R, Bossa C, Jeliazkova N, Worth A. The Benigni / Bossa Rulebase for Mutagenicity and Carcinogenicity - A Module of Toxtree - EUR 23241 EN 2008.

[28] Benigni R & Bossa C, Mechanisms of Chemical Carcinogenicity and Mutagenicity: A Review with Implications for Predictive Toxicology, Chem Rev. 2011 Apr 13;111(4):2507-36.

[29] Ciaravino V, Plattner J, Chanda S, An Assessment of the Genetic Toxicology of Novel Boron-Containing Therapeutic Agents, Environmental and Molecular Mutagenesis 54:338-346 (2013).

[30] Enoch SJ & Cronin MTD, A review of electrophilic reaction chemistry involved in covalent DNA binding, Critical Reviews in Toxicology, 2010, 40, 728-748.

[31] S. Enoch, M.T.D. Cronin Development of new structural alerts suitable for chemical category formation for assigning covalent and non-covalent mechanisms relevant to DNA binding , Mutation Research 743 (2012) 10-19.

Page 20: Leadscope® Genetox Expert Alerts Version 3.0 White Paper An …€¦ · that two distinct in silico methodologies can be used to qualify certain drug impurities as non-mutagenic

An expert alert system to predict the mutagenic potential of impurities to support the ICH M7 guideline May 2016

www.leadscope.com 20 [email protected]

[32] Kazius J, McGuire R, Bursi R. Derivation and Validation of Toxicophores for Mutagenicity Prediction. J. Med. Chem., 2005, 48: 312-320.

[33] OECD Report No.120 Report of the Expert Consultation on Scientific and Regulatory Evaluation of Organic Chemistry Mechanism-Based Structural Alerts for the Identification of DNA Binding Chemicals Part 1 and Part 2 (2010).

[34] Shamovsky I et. al. J Am Chem Soc. 2011 Oct 12;133(40):16168-85.

[35] Galloway S.M. et al. Regulatory Toxicology and Pharmacology 66 (2013) 326–335.

[36] Harold ER, Means WS (2009), XML in a Nutshell 3rd edition, O'Reilly Media.

[37] Dalby, A, Nourse, JG, Hounshell, WD, Gushurst, AKI, Grier DL, Leland BA, Laufer, J. (1992). "Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited". Journal of Chemical Information and Modeling 32 (3): 244.

[38] Amberg A, Harvey JS, Czich A, Spirkl HP, Robinson S, White A, and Elder DP Do Carboxylic/Sulfonic Acid Halides Really Present a Mutagenic and Carcinogenic Risk as Impurities in Final Drug Products? Organic Process Research & Development 2015, DOI: 10.1021/acs.oprd.5b00106.

[39] Hansen K. et al., J. Chem. Inf. Model. 2009, 49:2077-2081.

[40] Ahlberg E, Amberg A, Beilke LD, Bower D, Cross KP, Custer L, Dobo K, Ford KA, Van Gompel J, Harvey J, Honma M, Jolly R, Joossens E, Kemper R, Kenyon M, Kruhlak N, Kuhnke L, Leavitt P, Neilan C, Naven R, Quigley DP, Shuey D, Sprikl HP, Stavitskaya L, Teasdale A, White A, Wichard J, Zwickl C, Myatt GJ, Ensuring regulatory acceptable (Q)SAR models reflect proprietary chemical space for aromatic amine mutagenicity, submitted to Regulatory Toxicology and Pharmacology.

[41] Stavitskaya L, Minnier BL, and Kruhlak NL, Benchmarking Assessment of Open Source and Newly Released Salmonella Mutagenicity (Q)SAR Models for Potential Use Under ICH M7, SOT 2014 Poster Abstract #2273b.

[42] Amberg, A., Beilke, L., Bercu, J., Bower, D., Brigo, A., Cross, K.P., Custer, L., Dobo, K., Dowdy, E., Ford, K.A., Glowienke, S., Gompel, J.V., Harvey, J., Hasselgren, C., Honma, M., Jolly, R., Kemper, R., Kenyon, M., Kruhlak, N., Leavitt, P., Miller, S., Muster, W., Nicolette, J., Plaper, A., Powley, M., Quigley, D.P., Reddy, M.V., Spirkl, H.-P., Stavitskaya, L., Teasdale, A., Weiner, S., Welch, D.S., White, A., Wichard, J., Myatt, G.J., 2016. Principles and procedures for implementation of ICH M7 recommended (Q)SAR analyses. Regulatory Toxicology and Pharmacology 77, 13–24.

[43] Cross, K.P., Myatt, G.J., Hasselgren, C., Mitigation of Bacterial Mutation (Q)SAR Alerts using Limited-Ames and Tester Strain Data, SOT 2016 Poster Abstract #3738.