a fast computer program for conformational analysis

5

Click here to load reader

Upload: giorgio-castellani

Post on 21-Jun-2016

218 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: A fast computer program for conformational analysis

A FAST COMPUTER PROGRAM FOR CONFORMATIONAL ANALYSIS

GIORGIO CASTELLANI and RAIMONW SCORDAMAGLIA Istituto Ricerche “Guide Donegani”, Via G. Fauser 4, 28100 Novara, Italy

(Received I8 Jammy 1983; in revised form 6 July 1983)

Abstract-We describe a program, caned SOBALCAN, that enables us to compute very quickly the rotation hypcrsurfacc of a flexible molecule in the “soft-sphere” approximation. Only thoseconformations in which the overlap of spheres associated to all pairs of non-bonded atoms is less than a prefixed threshold are computed. _

INTRODUCTION

Conformational freedom plays an important role in the study of structure-activity relationships of drugs: since the rotation of atomic groupings about single bonds is more or less hindered, molecules can adopt a number of shapes, often very different from each other, of which only a few can interact with the receptor site. This latter, in turn, is usually assumed to have a well-defined geometry, resulting from the bal- ance of intramolecular interactions, and the changes it undergoes when interacting with the active mole- cules are regarded as fairly small and their energetic cost negligible.

If one wants to determine which requirements mol- ecules should satisfy in order to be active, their steric properties must, therefore, be known. No detailed information on the receptor structure is generally available; in most cases, even the specific receptor for certain molecules is unknown. One then has to infer the receptor form by studying the steric features of a series of molecules which, on experimental basis, are known to possess a certain kind of activity. The rationale underlying this approach is that, if con- generic molecules interact with the same receptor site, they must share some steric properties.

The assessment of common features is not a simple task when dealing with classes of molecules having a large number of rotational degrees of freedom; in principle, all accessible conformations should be con- sidered. If no information enabling one to limit his search is at hand, some simplifying assumptions have to be made: the most frequently used consists of comparing molecules in their minimum-energy con- formations (as determined through theoretical com- putations) or in the crystallographic conformations, when known. However, this might be a poor assump- tion for flexible ligands. Only the achievement of satisfactory relationships with activity for certain molecules or, hopefully, of correct predictions for new molecules, give a confirmation aposteriori of the validity of this approach.

From literature we see that unfortunately none of the numerical methods of energy potential function minimization can solve the problem of finding the global minimum in a definite way. The most obvious one is to calculate conformational energy at a number of points in conformational space chosen either at random or over a fixed grid of values of independent variables and finally compare the energy values at various local minima (McCornick, 1972). In prin-

ciple, this method requires an enormous number of energy evaluations. But the problem can be reduced to a tractable size eliminating certain regions of conforrnational space on the basis of a study of molecular models and to perform the minimum search only in selected subspaces (Lewis er al., 1973). Another way, if you have experimental evidence, for example NMR spectra available, is to automatically analyse the raw data by searching for conformations that are possible sterically and which agree with experimental data. In this case the conformation assumed you can consider the ensemble average seen by the physical technique (Barry er al., 1971). To do this either interactive methods are described, which are able to take advantage of the power and ver- satility of the “human computer” for pattern recog- nition and inductive thinking, or automatic ones, which define some criterion function to describe the acceptability of a conformation which can then be optimized (Barry et al., 1974). Particularly in large molecules, one of the most important problems is that of false minima. Multiple local minima are quite common and a local minimum may have a higher statistical weight than the global one and may there- fore correspond to the actual predominant con- formation of a molecule (Scheraga, 197 1).

Our present effort is directed both to reach auto- matically local minima in conformational hyper- surface and to define regions in the variables space in which we can perform further energy optimization.

The possibility of yielding and checking all con- formations a molecule can assume appears at first sight beyond the present possibilities of computation. The exploration of the whole conformational space for a molecule with rn degrees of rotational freedom (m rotatable bonds), in steps of 360”/n for each rotation, requires the computation of n” con- formations: even in the simple case of m = 5 and n = 12, the number of possible conformations (248, 832) is excessive even with the fastest algorithms.

Only a fraction of conformations which are possi- ble from the geometric point of view are acceptable from the energy point of view. But is there any means to avoid computing the energy of those con- formations which can be regarded with certainty as high-energy ones? In other words, if N, is the total number of molecular conformations and N, (in gen- eral < NJ is the number of conformations whose non-bonded energy is lower than a prefixed thresh- old, is it possible to reduce the number N, of actual

127

Page 2: A fast computer program for conformational analysis

128 G. CASTELLANI and R. SCOFUXMAGLIA

computations to a value as close as possible to N., by establishing criteria such that N, - N, (the number of conformations which can be safely discarded without computing their energy) is as high as possible?

We have developed a computer program which allows us to give an affirmative answer to this ques- tion. It has been called SOBALCAN (Soft-Ball Con- formational Analysis) and its description forms the subject of the present paper.

GENERATING CONFORMATIONS

A starting molecular conformation is constructed by means of the STERIMOL program (Verloop et nl., 1976). The internal rotation angles are chosen by manipulating Dreiding (or equivalent) stereomodels into “plausible” shapes, i.e. in such a way that a visual inspection reveals the absence of any bad contacts between nonbonded atoms. If the molecule contains atoms whose structure cannot be re- produced properly with the data stored in STER- TMOL, we use the program BAC (Builder of Atomic Coordinates), which computes the internal coordi- nates of atoms with any desired geometry (Castellani & Scordamaglia, 1980).

The molecule is then divided into m + 1 groups. Each group may be regarded as a rigid body; should unacceptable contacts between non-bonded atoms exist within it, none of the inter-group rotations would be capable of restoring a good geometry. Thus the program makes a preliminary check on the above interactions, and stops if bad contacts are detected.

In the next step interactions between neighboring groups are considered, by performing the M rotations about the bonds connecting groups. From this anal- ysis we obtain a m x n rectangular matrix which contains the allowed and the non-allowed angular ranges for each rotation (see next section for the criteria of acceptability). The forbidden angular ranges are no more taken into consideration by the program, since those contacts between non-bonded atoms which cause a range to be put aside depend only on the dihedral angle considered and cannot be removed by rotating the others. The overall com- putational gain, i.e. the number of conformations which need no more to be examined, is of n”- ’ for each range removed.

At this point the program makes a control to detect all possible rotational symmetries, and reduces the angular range accordingly: for example, for a methyl group, the angular range explored is 120” instead of 360”.

Subsequently all angles are rotated sequentially, according to the following scheme. Let us indicate by

0 rp:, P2r I.. I cp; the starting values of internal rota- tion angles.

First a complete rotation of (P,,, is made, i.e. the conformations

are computed; their number is less than n if, in the single rotation of (Pi, one or more forbidden ranges had been detected, and equal to n otherwise.

Then the angle qz _ , is given successive increments of 360/n, and each time the whole rotation of (Pi is repeated. When the cycle on 50, _ , is completed (after doing n2 computations at most), we give successive increments of 360/n to ~z_~, and for each of these (up to) n conformations we perform the coupled rotation [cp,, cp,_ J, thus reaching a maximum of n3 computations.

The procedure is then repeated for q,,_ 3r rp,-4,. . > ‘PI, thus covering the whole con- formational space with a multidimensional grid,

During the above process a large number of com- putations leading to predictably unacceptable con- formations may be avoided through the “skipping” process illustrated below.

Suppose that, when doing rotation q3, we detect, at &, a contact between two atoms belonging to groups G, and G,, resp. It is clear that rotations rp,, (p5 and ‘ps cannot remove this contact. Computation of the whole rotations q5 and ‘ps at (p, would be a waste of time, and so we can save (up to) n2 - 1 computations. If the rotation about the bond G,-G, had had a sequential number higher than 3, for example if we had called (p4 this rotation and cp, the rotation about G,-G,, we would have been able to save (up to) n’ - 1 computations. This proves that the choice of the order in which internal rotations are performed may differ from the sequence of rotatable bonds along the backbone chain: the more well-chosen that order, the smaller the value of N,. As an exampk in Table 1 we report the number of total calculated conformations N, obtained rotating three angles cp,, ‘Pi, ‘Pi and (Pi. ‘p5, q, (see Fig. I) in separated calculations respectively, of NRDC 161, an in- secticide of cyclopropane carboxylate class, in all possible different sequences. Obviously the number of accepted conformations for a threshold of compen-

Table 1.

No. of calculated conformations N,

107 252 107 192 252 204

122 192 122 192 192 192

Page 3: A fast computer program for conformational analysis

A fast computer program for conformational analysis

Fig. 1. Molecule of NRDC 161 (a - cyano - 3 - phenoxybenzyl cis - 2,2 - dimethyl - 3 - (2,2 - dibromovinyl)cyclopropane carboxylate from X-ray crystal structure analysis (Owen, 1975), with the

indication of the seven internal rotations dealt with by SOBALCAN.

129

etrability of non-bonded spheres of 20% does not change with the chosen order and is fixed to 48, in the first case, and 106 in the second.

The number of conformations which can be dis- carded in this way is timmk, where k is the sequential number of the rotation for which the skip occurs, for each angular range skipped.

Until now we have spoken of conformations, and shown the devices to reduce the number of the computed ones. But it should be borne in mind that “to compute a conformation” actually means to compute a number of interatomic distances of the order of the square of the number of atoms contained in the moIecule. A careful consideration of the non- bonded interactions allows us to further reduce the computational time. First of all, only the so-called non-bonded active interactions (Tosi & Lipari, 1981), i.e. interactions between atoms whose reciprocal dis- tance is affected by a rotation, are taken into consid- eration. Secondly, the indices corresponding to the interacting atoms are collected into matrices (one matrix per internal rotation angle), of which the rows are formed by the atoms in the groups before the rotated bond, and the columns by the atoms in the groups after rotated bond. With this ordering, we give priority to those interactions which, if above the acceptability threshold, allow us to determine the internal rotation angle for which the largest possible number of computations are saved.

By using these criteria we were able to reduce the number of computed conformations by approxi- mately an order of magnitude, and to keep the CPU time to a few minutes on our UNIVAC 1100/20 computer for each of the molecules reported in Table 2. These molecules are insecticides of the cyclo- propane carboxylate ester class, synthesized by Elliott (Elliott et al., 19741: they have the general formula X,C = CHCHC(CH,),CHC(O)OCHRPhOPh, with different isomers of the ethylenic part with respect to the carboxylic group (see also Fig. I for an indication of the five internal rotations considered).

We see from Table 2 that, on the average, the number iv, of computed conformations is about l/6 of the total number of possible conformations N,. In turn, the number n, of accepted conformations is about l/3 of N,. In actual fact, the difference between “computed” and “accepted” conformations is much

Table 2. Conformational analysis with SOBALCAN of a series of cyclopropane carboxylate esters

NRDC Isometry X R N, iV, 3 3 No.? N, N<

143 tram Cl H 23232 56506 0.227 0.411 157 cis Br H 10637 29000 0.117 0.367 158 trans Br CN 17332 49893 0.201 0.347 161 CiS Br CN 8056 26324 0.106 0.306 163 trans Br H 22954 565.51 0.227 0.406 166 trans Cl CN 17625 49974 0.201 0.353 167 cis Cl H 10883 29458 0.118 0.369 168 cis Cl CN 8216 26726 0.107 0.307 169 trans F CN 21171 58248 0.234 0.363 170 cis F CN 9164 28200 0.113 0,325 171 trans F H 27782 65809 0.264 0.422 172 cis F H 12081 31061 0.125 0.389

TNRDC stands for National Research Development Company, the English institution from which these data come.

smaller than it appears from this comparison. A conformation is accepted when the summation of overlaps, extended over all pairs of non-bonded atoms, does not exceed a global threshold (see next section), chosen in advance by the program user. A conformation is computed, but not accepted, when at least one contact between non-bonded atoms higher than a single threshold is detected: clearly this de- tection can occur at any time during the examination of non-bonded interactions, and as soon as this happens the conformation is rejected and the remain- ing interactions are not considered.

THE SOFT-SPHERE APPROXLMATION

According to the well-known Pauling’s model, at- oms may represented as spheres which, on penetrating into each other, give rise to bonds, the penetration degree increasing from single to double to triple bonds. The radii of the spheres, typical for any atom, are the van der Waals radii of the corresponding chemical species, as defined, e.g. by Bondi (1964).

The criterion we chose to decide on the acceptability of conformations is the degree of interpenetration of the spheres centered on any pair of non-bonded at- oms. Whenever two non-bonded atoms (i.e. two at-

Page 4: A fast computer program for conformational analysis

130 G. CASTELLANI and R. SCORDAMAGLIA

16 2 z 14

3 12 x

10

a

6

4

2

0

c

: 0

-

>

0

- \ /

90 180 -90

; y+JY+j d 90 180 -90 0

16 nl

14 2

12 ii x

10

8

6

4

2

0

Fig. 2. PCILO energies, in kcal mole- ‘, for rotations about the bonds adjoining the cyclopropane ring in crysanthemic acid, and maximum fractional overlap between non-bonded atoms (open circles).

oms at three or more bond distance) approach one another closer than a certain limit, the molecular en- ergy increases very rapidly and its largest part is due to this interaction. A geometric parameter capable to describe this phenomenon in a satisfactory way is the overlap between the spheres associated to the two atoms. The absolute value of the overlap has the disadvantage of dealing with small atoms (i.e. atoms with short van der Waals radius) and large atoms (i.e. atoms with long van der Waals radius) in the same way. This does not reflect the trend of the energy curves with the interatomic distance: in fact large atoms, due to the greater polarizdbility of their outer- shell electrons, can approach closer before energy increases to unacceptable values. A better option proved to be the $-actionnl overlap, expressed as 0, = 1 - f&&r0 + rh) (& < r, + rb), as its plot versus internal rotation angles was fairly similar to that of the energy curve computed with quantum-mechanical procedures: see for example Fig. 2, in which we com- pare the mrtximum fractional overlap between non- bonded atoms with the energies of crysanthemic acid conformations corresponding to rotations around the two bonds adjoining the cyclopropane ring, computed with the quantum-mechanical semirigorous method PCILO (Castellani et a!., 1980).

The fact that, when two atoms approach one an- other too close, the predominant contribution to in- tramolecular energy arises from their repulsion, allows us to discard a conformation as soon as we detect an interaction higher than a “single” threshold, and to avoid any further test. A molecular conformation may be regarded as unacceptable also when many local strains are present, i.e. strains which, if considered separately, would be tolerated, but become un- acceptable if considered together. We, therefore, define also a “global” threshold, and discard a con- formation when the sum of two or more overlaps exceeds it.

The choice of the threshold above which a con- formation should be discarded is critical and depends to a large extent on the subsequent use of the results of our conformational analysis, Should we take o, - 0, the model would coincide with the hard-sphere one, in which no overlap between the spheres is admitted; most conformations are then discarded. If,

on the other hand, we take o,m 1, all geometrically possible conformations are computed and accepted, with a large increase of the computer time. In choos- ing a value of of intermediate between these two extremes, many criteria may be applied. First it is necessary to decide at which energy value a con- formation is regarded as unacceptable; from this, one easily finds the corresponding overlap. On the other hand, the inherent approximations of the method should be taken into account. For example, bond stretching and bond angle deformation are neglected, while it is known that relaxation of all internal degrees of freedom of a molecule can provide easier paths for conformational interconversion, by reduc- ing the overlaps between pairs of nonbonded atoms. One should also bear in mind that the computing time increases with increasing the threshold and with decreasing the width of the angular ranges. It is thus necessary to find a good compromise between these contrasting factors, without losing the significance of results. In our correlation studies we obtained satis- factory results with n = 12 (steps of 30”)) a single overlap of= 0.2 and a global overlap of 30~ If the interest in using this program is limited to find the energy minima, a threshold of 0.05 may be sufficient.

CONCLUDING REMARKS

The methodology used in SOBALCAN does not imply the direct computation of intramolecular en- ergy; this energy, though, is implicit in the concept of mutual penetration of van der Waals spheres centered on non-bonded atom pairs. Because of this character- istic, the method is more suited to ascertain the “density” of conformational states accessible in the various angular regions, than to determine the stability of a given conformation.

It is necessary to point out that coulombic inter- actions, and in particular hydrogen bonds, which can substantially modify the molecular energy, are not considered in this version of the program. So, the use of SOBALCAN is discouraged when dealing with molecules where these effects play an important role (e.g. ions, or molecules with strong intramolecular hydrogen bonds).

To obviate these drawbacks an improved version of the present program, which we plan to describe in

Page 5: A fast computer program for conformational analysis

A fast computer program for conformational analysis 131

a subsequent paper, has therefore been worked out; in this new program (ERHYCA, which stands for Entire Rotational Hypersurface Conformational Analysis) we have maintained the same algorithms of SOBALCAN to save computer time, but the acces- sibility of conformations is evaluated by means of a potential energy function.

REFERENCES

Bondi, A. (1964). 1. Phys. Chem. 68. 441. Castellani, G., Scordamaglia, R. C Tosi. C. (1980), Gazz.

Chim. Ital. 110, 457. Castellani, G. & Scordamaglia, R. (1980), EUCHEM

Conference on Logical Structures, P&I-I, FRG. Elliott, M. & Famham, A. W. ef al. (1974). Natwe 248,

710.

Owen, J. D. (1975), J. Chem. Sot. Per/& I 186.5. Tosi, C. & Lipari, G. (1981), Thee-et. Chim. Acfo (Berlin)

60. 41 Verl&p, A., Hoogenstraaten, W. & Tipker, J. (I976), in

MedicinaI Chemistry, Vol. VII, Chap. 4, E. J. AriBns, Ed., Academic Press, New York.

McComick, G. P. (1972), in Numerical Methods for Non-Linear Optimization, F. A. Lotsma, Ed., Academic Press, London, p. 209.

Lewis, P. N., Momany, F. A. & Scheraga, H. A. (1973), ISraP J. Chem. 11, 121.

Barry, C. D., North, A. C. T’., Glasel, J. A., Williams, R. J. P. & Xavier, A. V. (1971), Nolure 232, p. 236 et seq.

Barry, C. D., Bosshard, H. E., Ellis. R. A. & Marshall, G. R. (1974), Fed. Proc. 33(12).

Scheraga, H. A., (1971), Chem. Rev. 71, 195.