Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf
Скачиваний:
68
Добавлен:
15.08.2013
Размер:
5.59 Mб
Скачать

Internal Coordinate Simulation

119

approximation tends to underestimate the relative weight of torsion fluctuations. In general, because of the sampling limitations, conformational entropies cannot be accurately evaluated, and it remains unclear how large their relative contribution is to the conformational stability of biopolymers, let alone the specific bond angle part.

(A5) In calculations, freezing of bond angles noticeably reduces the rate of torsion transitions [12,21], which sometimes is also attributed to the bond angle contribution to the transition state entropy.

(B5) Generally, in molecular modeling, such discrepancies should be corrected by adjusting force field parameters. One should note, however, that this specific effect has more than one interpretation. The bond angle motion contributes entropy both to saddle points and to minimum energy states, and, in principle, it can affect reaction rates both ways. It is known, on the other hand, that the rate of barrier crossings in classical MD is always overestimated because of the so-called zero point energy problem [24,25]. The ‘‘zero point energy’’ is the ground-level energy of molecular vibrations. In reality it is unexchangeable, but in classical dynamics it serves as a vast, easily accessible energy reservoir for barrier-crossing fluctuations. The imposition of constraints is, in fact, one of the ways to correct for this deficiency.

The foregoing polemic is of significant theoretical interest, but, in my opinion, none of these arguments can justify or disqualify this or that specific technique. For the latter purpose, computed results must be compared with experimental data, and this cannot be replaced by calculations with and without constraints. Because of the severe sampling limitations, the experimental basis for such comparisons is extremely narrow, and suitable experimental data can instead be used for fitting force field parameters. At present, and into the foreseeable future, the sampling power of MC and MD methods remains the key limiting factor in all biological applications, especially in ‘‘true simulations.’’ Compared to other means commonly used to improve the sampling, such as high temperature, the standard geometry constraints propose a relatively harmless remedy anyway.

III. PRINCIPLES OF MODELING WITH INTERNAL COORDINATES

Every phase of internal coordinate modeling admits many methodological variations, and I do not attempt to review them all. I outline only the standard problems encountered in any particular domain of application and the common practical solutions.

A.Selection of Variables

First of all, one needs to choose the local coordinate frame of a molecule and position it in space. Figure 2a shows the global coordinate frame xyz and the local frame xyzbound with the molecule. The origin of the local frame coincides with the first atom. Its three Cartesian coordinates are included in the whole set and are varied directly by integrators and minimizers, like any other independent variable. The angular orientation of the local frame is determined by a quaternion. The principles of application of quaternions in mechanics are beyond this book; they are explained in detail in well-known standard texts

120

Mazur

Figure 2 Illustration of the definition of internal coordinates. Main chain atoms are shown as filled circles.

[26,27]. We note only that a rigid-body quaternion involves four components bound by a normalization condition, which can be viewed as polar coordinates on a four-dimensional sphere isomorphous to the group of rigid-body rotations. Physically, the rotational motion involves three degrees of freedom and is completely determined by three Cartesian components of the angular velocity. The derivatives of the four quaternion components are related to the angular velocity by standard differential relations called kinematics equations. The latter are used both in dynamics and in energy minimization to compute variations of quaternions corresponding to infinitesimal rigid-body rotations.

The second atom always rests on axis Ox, and the third atom always rests in plane xOy. The second atom moves along the Oxaxis if bond length 1-2 is not fixed. Similarly, the third atom moves in the xOyplane if bond 2-3 and/or bond angle 1-2-3 are variable. With these conventions, the six rigid-body degrees of freedom of frame xyzalways complement the internal degrees of freedom in the molecule to the full set required. The fourth and the following atoms are added one by one, following a certain predefined order. At each step, one bond length, one planar angle, and one dihedral angle are used to position an atom. Note that the number of bonds available equals the number of atoms only for tree topologies; the number of bond angles and dihedrals is much larger. There is, therefore, considerable freedom in both atom ordering and the selection of bonds and angles employed as independent variables. In Figure 2b atoms at branches are assigned consecutive numbers starting from the main chain direction. For example, atom 6 is positioned with dihedral 1-2-3-6, angle 2-3-6, and bond 3-6. The corresponding triples for atoms 7 and 8 are (3-6-2-7, 6-3-7, 3-7) and (3-7-6-8, 7-3-8, 3-8). Only these bonds and angles are included in the set of independent internal coordinates.

In Figure 2b the main chain branch is positioned with true torsion, while the other branches employ so-called improper torsions. The utility of this seemingly strange choice becomes clear when the standard geometry approximation is required. In this case bonds, valence angles, and ‘‘improper torsions’’ are fixed, and the molecule is divided into rigid bodies. With the above definition, atoms can be sorted by using the following simple algorithm. Atoms are checked one by one according to their ordering, and if one of the internal coordinates of a given atom is not fixed it forms a new rigid body. If that is not the case, the atom is assigned to the rigid body of the lower branching point or to that of the previous atom, depending on whether it belongs to the main chain or to a secondary branch. This rule is easy to program and gives the standard geometry representation automatically regardless of the chemical nature of the molecule. In addition, it can produce

Internal Coordinate Simulation

121

reasonable models with some bond angles free, which is necessary for nucleic acids, for instance. Figure 2c shows the standard geometry rigid bodies for this specific example.

B. Energy Gradients

As we see, internal coordinates are always defined together with the transformation rules that tell how atomic Cartesian coordinates must be computed. This construction determines also how the structure changes in response to variation of internal coordinates, and it is easy to see that the infinitesimal displacements of atoms correspond to the motion of a tree. Namely, the first three atoms form the base, internal coordinates close to the base move almost the whole molecule, whereas variables in a specific branch move only the upper part of that branch. For example, when torsion 1-2-3-6 in Figure 2b is varied then, by definition, the first three atoms are not affected, whereas the fourth and the following atoms move together. Let us introduce the term ‘‘articulated body’’ to refer to the set of atoms that move when variable qi is varied and denote it Di. If qi is a translational variable, like a bond length or a Cartesian coordinate of the first atom, the articulated body is translated along a fixed direction, and it is an easy matter to show that the partial derivative of the potential energy, ∂U/∂qi, equals the component of the total force applied to Di along this direction. Therefore it can be computed as

 

U

ei fα fi

(1)

qi

 

 

α Di

 

where ei is the unit vector of the translation, fα are forces applied to atoms, and fi is the possible additional force due to the bond stretching potential. Similarly, when qi is a rotational variable, such as a torsion, a valence angle or one of the rigid body rotations of the base, the articulated body rotates around a certain fixed axis and ∂U/∂qi is equal to the projection upon it of the total torque applied to Di. It is computed as

 

U

ei rα fα eiri fα ti

(2)

qi

 

 

α Di

α Di

 

where ei is the unit vector of the rotation axis, ri is its position vector, rα are atom position vectors, and ti is the possible additional torque due to the torsion or bond angle bending potential.

When the energy gradient is evaluated by a computer the calculation of atomic forces fα produced by bonded and nonbonded interactions takes a vast majority of time. Summations involved in Eqs. (1) and (2) are rapid because the articulated bodies fit one into another like in a Russian doll. In an unbranched chain D1 D2 Dn 1 Dn, and therefore the sums in Eqs. (1) and (2) can be computed starting from the tip of the chain, moving to the base, and at the ith variable adding only the contribution from subset Di/Di 1. Extension to trees is straightforward. The idea of such recurrent calculations belongs to Go and collaborators [28], and it appears in many other applications. As a result, in terms of computer time, energy gradients with respect to internal coordinates are no more expensive than atomic forces in Cartesian coordinate calculations. In fact, even a small savings can be achieved, because the last terms in Eqs. (1) and (2) can be evaluated directly.

122

Mazur

IV. INTERNAL COORDINATE MOLECULAR DYNAMICS

A. Main Problems and Historical Perspective

Molecular dynamics has emerged as an application to molecules of the general method of point particles, with Cartesian coordinates and Newton’s equations, and it was first applied to flexible polymeric molecules more than 20 years ago [29,26,30]. It was already clear at that time that harmonic potentials that keep bond lengths and bond angles close to their standard values severely limited the time step and that it would be desirable to get rid of these ‘‘uninteresting’’ degrees of freedom. No wonder, therefore, that early attempts to apply internal coordinates with the standard geometry approximation in MD were made at the same time [19,31]. This way, however, appeared too complicated and has been abandoned in favor of an alternative approach proposed by Ryckaert et al. [32], which consists in imposing holonomic distance constraints upon a system of point particles governed by Newton’s equations. Their method, now called constraint dynamics, is reviewed elsewhere in this book. However, although it seemed initially that not only bond lengths but also bond angles, dihedral angles, and larger rigid groups could be fixed by using triangulation, this was found to be true only for very small molecules. In large complex polymers such as proteins, even bond angles cannot be fixed in this way [12].

The intrinsic limitation of constraint dynamics can be qualitatively understood from the underlying physical model. Imposing a distance constraint implies that a reaction force is introduced that is applied along a line joining two atoms. Reactions must be calculated at each time step to balance all other forces in the system, which for large branched molecules presents a difficult problem because they are all coupled and form a system of algebraic equations solved by iterations. With only bond lengths to hydrogens fixed, reactions are coupled in small groups, and iterations converge rapidly. With all bond lengths fixed, reactions are coupled globally, and looping becomes possible. In general, in this case the convergence is not guaranteed [33], but in practice it remains acceptable if reactions at the same atom are applied at obtuse angles. The last condition breaks down when bond angles are triangulated, and as a result the convergence becomes too slow.

These difficulties have led to a revival of work on internal coordinate approaches, and to date several such techniques have been reported based on methods of rigid-body dynamics [8,19,34–37] and the Lagrange–Hamilton formalism [38–42]. These methods often have little in common in their analytical formulations, but they all may be reasonably referred to as internal coordinate molecular dynamics (ICMD) to underline their main distinction from conventional MD: They all consider molecular motion in the space of generalized internal coordinates rather than in the usual Cartesian coordinate space. Their main goal is to compute long-duration macromolecular trajectories with acceptable accuracy but at a lower cost than Cartesian coordinate MD with bond length constraints. This task turned out to be more complicated than it seemed initially.

Two problems inherent in ICMD were clear from the beginning. The first is the derivation of equations of motion for large molecules, which was the main subject of the initial reports [19,34,38,39]. The second problem is the cost of additional calculations, which must be low enough to be compensated for by an increase in the step size. The need to invert the mass matrix to obtain generalized accelerations was the main obstacle. Generally, it is a full positive definite matrix; therefore, a direct inversion scales as O(n3) with the number of degrees of freedom and quickly becomes impractical when n exceeds 100. It turned out, fortunately, that this problem had been earlier solved in robot mechanics [43–46]. There are several recursive O(n) algorithms that can rapidly compute exact gener-