- •Foreword
- •Preface
- •Contents
- •Introduction
- •Oren M. Becker
- •Alexander D. MacKerell, Jr.
- •Masakatsu Watanabe*
- •III. SCOPE OF THE BOOK
- •IV. TOWARD A NEW ERA
- •REFERENCES
- •Atomistic Models and Force Fields
- •Alexander D. MacKerell, Jr.
- •II. POTENTIAL ENERGY FUNCTIONS
- •D. Alternatives to the Potential Energy Function
- •III. EMPIRICAL FORCE FIELDS
- •A. From Potential Energy Functions to Force Fields
- •B. Overview of Available Force Fields
- •C. Free Energy Force Fields
- •D. Applicability of Force Fields
- •IV. DEVELOPMENT OF EMPIRICAL FORCE FIELDS
- •B. Optimization Procedures Used in Empirical Force Fields
- •D. Use of Quantum Mechanical Results as Target Data
- •VI. CONCLUSION
- •REFERENCES
- •Dynamics Methods
- •Oren M. Becker
- •Masakatsu Watanabe*
- •II. TYPES OF MOTIONS
- •IV. NEWTONIAN MOLECULAR DYNAMICS
- •A. Newton’s Equation of Motion
- •C. Molecular Dynamics: Computational Algorithms
- •A. Assigning Initial Values
- •B. Selecting the Integration Time Step
- •C. Stability of Integration
- •VI. ANALYSIS OF DYNAMIC TRAJECTORIES
- •B. Averages and Fluctuations
- •C. Correlation Functions
- •D. Potential of Mean Force
- •VII. OTHER MD SIMULATION APPROACHES
- •A. Stochastic Dynamics
- •B. Brownian Dynamics
- •VIII. ADVANCED SIMULATION TECHNIQUES
- •A. Constrained Dynamics
- •C. Other Approaches and Future Direction
- •REFERENCES
- •Conformational Analysis
- •Oren M. Becker
- •II. CONFORMATION SAMPLING
- •A. High Temperature Molecular Dynamics
- •B. Monte Carlo Simulations
- •C. Genetic Algorithms
- •D. Other Search Methods
- •III. CONFORMATION OPTIMIZATION
- •A. Minimization
- •B. Simulated Annealing
- •IV. CONFORMATIONAL ANALYSIS
- •A. Similarity Measures
- •B. Cluster Analysis
- •C. Principal Component Analysis
- •REFERENCES
- •Thomas A. Darden
- •II. CONTINUUM BOUNDARY CONDITIONS
- •III. FINITE BOUNDARY CONDITIONS
- •IV. PERIODIC BOUNDARY CONDITIONS
- •REFERENCES
- •Internal Coordinate Simulation Method
- •Alexey K. Mazur
- •II. INTERNAL AND CARTESIAN COORDINATES
- •III. PRINCIPLES OF MODELING WITH INTERNAL COORDINATES
- •B. Energy Gradients
- •IV. INTERNAL COORDINATE MOLECULAR DYNAMICS
- •A. Main Problems and Historical Perspective
- •B. Dynamics of Molecular Trees
- •C. Simulation of Flexible Rings
- •A. Time Step Limitations
- •B. Standard Geometry Versus Unconstrained Simulations
- •VI. CONCLUDING REMARKS
- •REFERENCES
- •Implicit Solvent Models
- •II. BASIC FORMULATION OF IMPLICIT SOLVENT
- •A. The Potential of Mean Force
- •III. DECOMPOSITION OF THE FREE ENERGY
- •A. Nonpolar Free Energy Contribution
- •B. Electrostatic Free Energy Contribution
- •IV. CLASSICAL CONTINUUM ELECTROSTATICS
- •A. The Poisson Equation for Macroscopic Media
- •B. Electrostatic Forces and Analytic Gradients
- •C. Treatment of Ionic Strength
- •A. Statistical Mechanical Integral Equations
- •VI. SUMMARY
- •REFERENCES
- •Steven Hayward
- •II. NORMAL MODE ANALYSIS IN CARTESIAN COORDINATE SPACE
- •B. Normal Mode Analysis in Dihedral Angle Space
- •C. Approximate Methods
- •IV. NORMAL MODE REFINEMENT
- •C. Validity of the Concept of a Normal Mode Important Subspace
- •A. The Solvent Effect
- •B. Anharmonicity and Normal Mode Analysis
- •VI. CONCLUSIONS
- •ACKNOWLEDGMENT
- •REFERENCES
- •Free Energy Calculations
- •Thomas Simonson
- •II. GENERAL BACKGROUND
- •A. Thermodynamic Cycles for Solvation and Binding
- •B. Thermodynamic Perturbation Theory
- •D. Other Thermodynamic Functions
- •E. Free Energy Component Analysis
- •III. STANDARD BINDING FREE ENERGIES
- •IV. CONFORMATIONAL FREE ENERGIES
- •A. Conformational Restraints or Umbrella Sampling
- •B. Weighted Histogram Analysis Method
- •C. Conformational Constraints
- •A. Dielectric Reaction Field Approaches
- •B. Lattice Summation Methods
- •VI. IMPROVING SAMPLING
- •A. Multisubstate Approaches
- •B. Umbrella Sampling
- •C. Moving Along
- •VII. PERSPECTIVES
- •REFERENCES
- •John E. Straub
- •B. Phenomenological Rate Equations
- •II. TRANSITION STATE THEORY
- •A. Building the TST Rate Constant
- •B. Some Details
- •C. Computing the TST Rate Constant
- •III. CORRECTIONS TO TRANSITION STATE THEORY
- •A. Computing Using the Reactive Flux Method
- •B. How Dynamic Recrossings Lower the Rate Constant
- •IV. FINDING GOOD REACTION COORDINATES
- •A. Variational Methods for Computing Reaction Paths
- •B. Choice of a Differential Cost Function
- •C. Diffusional Paths
- •VI. HOW TO CONSTRUCT A REACTION PATH
- •A. The Use of Constraints and Restraints
- •B. Variationally Optimizing the Cost Function
- •VII. FOCAL METHODS FOR REFINING TRANSITION STATES
- •VIII. HEURISTIC METHODS
- •IX. SUMMARY
- •ACKNOWLEDGMENT
- •REFERENCES
- •Paul D. Lyne
- •Owen A. Walsh
- •II. BACKGROUND
- •III. APPLICATIONS
- •A. Triosephosphate Isomerase
- •B. Bovine Protein Tyrosine Phosphate
- •C. Citrate Synthase
- •IV. CONCLUSIONS
- •ACKNOWLEDGMENT
- •REFERENCES
- •Jeremy C. Smith
- •III. SCATTERING BY CRYSTALS
- •IV. NEUTRON SCATTERING
- •A. Coherent Inelastic Neutron Scattering
- •B. Incoherent Neutron Scattering
- •REFERENCES
- •Michael Nilges
- •II. EXPERIMENTAL DATA
- •A. Deriving Conformational Restraints from NMR Data
- •B. Distance Restraints
- •C. The Hybrid Energy Approach
- •III. MINIMIZATION PROCEDURES
- •A. Metric Matrix Distance Geometry
- •B. Molecular Dynamics Simulated Annealing
- •C. Folding Random Structures by Simulated Annealing
- •IV. AUTOMATED INTERPRETATION OF NOE SPECTRA
- •B. Automated Assignment of Ambiguities in the NOE Data
- •C. Iterative Explicit NOE Assignment
- •D. Symmetrical Oligomers
- •VI. INFLUENCE OF INTERNAL DYNAMICS ON THE
- •EXPERIMENTAL DATA
- •VII. STRUCTURE QUALITY AND ENERGY PARAMETERS
- •VIII. RECENT APPLICATIONS
- •REFERENCES
- •II. STEPS IN COMPARATIVE MODELING
- •C. Model Building
- •D. Loop Modeling
- •E. Side Chain Modeling
- •III. AB INITIO PROTEIN STRUCTURE MODELING METHODS
- •IV. ERRORS IN COMPARATIVE MODELS
- •VI. APPLICATIONS OF COMPARATIVE MODELING
- •VII. COMPARATIVE MODELING IN STRUCTURAL GENOMICS
- •VIII. CONCLUSION
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Roland L. Dunbrack, Jr.
- •II. BAYESIAN STATISTICS
- •A. Bayesian Probability Theory
- •B. Bayesian Parameter Estimation
- •C. Frequentist Probability Theory
- •D. Bayesian Methods Are Superior to Frequentist Methods
- •F. Simulation via Markov Chain Monte Carlo Methods
- •III. APPLICATIONS IN MOLECULAR BIOLOGY
- •B. Bayesian Sequence Alignment
- •IV. APPLICATIONS IN STRUCTURAL BIOLOGY
- •A. Secondary Structure and Surface Accessibility
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Computer Aided Drug Design
- •Alexander Tropsha and Weifan Zheng
- •IV. SUMMARY AND CONCLUSIONS
- •REFERENCES
- •Oren M. Becker
- •II. SIMPLE MODELS
- •III. LATTICE MODELS
- •B. Mapping Atomistic Energy Landscapes
- •C. Mapping Atomistic Free Energy Landscapes
- •VI. SUMMARY
- •REFERENCES
- •Toshiko Ichiye
- •II. ELECTRON TRANSFER PROPERTIES
- •B. Potential Energy Parameters
- •IV. REDOX POTENTIALS
- •A. Calculation of the Energy Change of the Redox Site
- •B. Calculation of the Energy Changes of the Protein
- •B. Calculation of Differences in the Energy Change of the Protein
- •VI. ELECTRON TRANSFER RATES
- •A. Theory
- •B. Application
- •REFERENCES
- •Fumio Hirata and Hirofumi Sato
- •Shigeki Kato
- •A. Continuum Model
- •B. Simulations
- •C. Reference Interaction Site Model
- •A. Molecular Polarization in Neat Water*
- •B. Autoionization of Water*
- •C. Solvatochromism*
- •F. Tautomerization in Formamide*
- •IV. SUMMARY AND PROSPECTS
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Nucleic Acid Simulations
- •Alexander D. MacKerell, Jr.
- •Lennart Nilsson
- •D. DNA Phase Transitions
- •III. METHODOLOGICAL CONSIDERATIONS
- •A. Atomistic Models
- •B. Alternative Models
- •IV. PRACTICAL CONSIDERATIONS
- •A. Starting Structures
- •C. Production MD Simulation
- •D. Convergence of MD Simulations
- •WEB SITES OF INTEREST
- •REFERENCES
- •Membrane Simulations
- •Douglas J. Tobias
- •II. MOLECULAR DYNAMICS SIMULATIONS OF MEMBRANES
- •B. Force Fields
- •C. Ensembles
- •D. Time Scales
- •III. LIPID BILAYER STRUCTURE
- •A. Overall Bilayer Structure
- •C. Solvation of the Lipid Polar Groups
- •IV. MOLECULAR DYNAMICS IN MEMBRANES
- •A. Overview of Dynamic Processes in Membranes
- •B. Qualitative Picture on the 100 ps Time Scale
- •C. Incoherent Neutron Scattering Measurements of Lipid Dynamics
- •F. Hydrocarbon Chain Dynamics
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Appendix: Useful Internet Resources
- •B. Molecular Modeling and Simulation Packages
- •Index
Internal Coordinate Simulation |
119 |
approximation tends to underestimate the relative weight of torsion fluctuations. In general, because of the sampling limitations, conformational entropies cannot be accurately evaluated, and it remains unclear how large their relative contribution is to the conformational stability of biopolymers, let alone the specific bond angle part.
(A5) In calculations, freezing of bond angles noticeably reduces the rate of torsion transitions [12,21], which sometimes is also attributed to the bond angle contribution to the transition state entropy.
(B5) Generally, in molecular modeling, such discrepancies should be corrected by adjusting force field parameters. One should note, however, that this specific effect has more than one interpretation. The bond angle motion contributes entropy both to saddle points and to minimum energy states, and, in principle, it can affect reaction rates both ways. It is known, on the other hand, that the rate of barrier crossings in classical MD is always overestimated because of the so-called zero point energy problem [24,25]. The ‘‘zero point energy’’ is the ground-level energy of molecular vibrations. In reality it is unexchangeable, but in classical dynamics it serves as a vast, easily accessible energy reservoir for barrier-crossing fluctuations. The imposition of constraints is, in fact, one of the ways to correct for this deficiency.
The foregoing polemic is of significant theoretical interest, but, in my opinion, none of these arguments can justify or disqualify this or that specific technique. For the latter purpose, computed results must be compared with experimental data, and this cannot be replaced by calculations with and without constraints. Because of the severe sampling limitations, the experimental basis for such comparisons is extremely narrow, and suitable experimental data can instead be used for fitting force field parameters. At present, and into the foreseeable future, the sampling power of MC and MD methods remains the key limiting factor in all biological applications, especially in ‘‘true simulations.’’ Compared to other means commonly used to improve the sampling, such as high temperature, the standard geometry constraints propose a relatively harmless remedy anyway.
III. PRINCIPLES OF MODELING WITH INTERNAL COORDINATES
Every phase of internal coordinate modeling admits many methodological variations, and I do not attempt to review them all. I outline only the standard problems encountered in any particular domain of application and the common practical solutions.
A.Selection of Variables
First of all, one needs to choose the local coordinate frame of a molecule and position it in space. Figure 2a shows the global coordinate frame xyz and the local frame x′y′z′ bound with the molecule. The origin of the local frame coincides with the first atom. Its three Cartesian coordinates are included in the whole set and are varied directly by integrators and minimizers, like any other independent variable. The angular orientation of the local frame is determined by a quaternion. The principles of application of quaternions in mechanics are beyond this book; they are explained in detail in well-known standard texts
120 |
Mazur |
Figure 2 Illustration of the definition of internal coordinates. Main chain atoms are shown as filled circles.
[26,27]. We note only that a rigid-body quaternion involves four components bound by a normalization condition, which can be viewed as polar coordinates on a four-dimensional sphere isomorphous to the group of rigid-body rotations. Physically, the rotational motion involves three degrees of freedom and is completely determined by three Cartesian components of the angular velocity. The derivatives of the four quaternion components are related to the angular velocity by standard differential relations called kinematics equations. The latter are used both in dynamics and in energy minimization to compute variations of quaternions corresponding to infinitesimal rigid-body rotations.
The second atom always rests on axis O′x′, and the third atom always rests in plane x′O′y′. The second atom moves along the O′x′ axis if bond length 1-2 is not fixed. Similarly, the third atom moves in the x′O′y′ plane if bond 2-3 and/or bond angle 1-2-3 are variable. With these conventions, the six rigid-body degrees of freedom of frame x′y′z′ always complement the internal degrees of freedom in the molecule to the full set required. The fourth and the following atoms are added one by one, following a certain predefined order. At each step, one bond length, one planar angle, and one dihedral angle are used to position an atom. Note that the number of bonds available equals the number of atoms only for tree topologies; the number of bond angles and dihedrals is much larger. There is, therefore, considerable freedom in both atom ordering and the selection of bonds and angles employed as independent variables. In Figure 2b atoms at branches are assigned consecutive numbers starting from the main chain direction. For example, atom 6 is positioned with dihedral 1-2-3-6, angle 2-3-6, and bond 3-6. The corresponding triples for atoms 7 and 8 are (3-6-2-7, 6-3-7, 3-7) and (3-7-6-8, 7-3-8, 3-8). Only these bonds and angles are included in the set of independent internal coordinates.
In Figure 2b the main chain branch is positioned with true torsion, while the other branches employ so-called improper torsions. The utility of this seemingly strange choice becomes clear when the standard geometry approximation is required. In this case bonds, valence angles, and ‘‘improper torsions’’ are fixed, and the molecule is divided into rigid bodies. With the above definition, atoms can be sorted by using the following simple algorithm. Atoms are checked one by one according to their ordering, and if one of the internal coordinates of a given atom is not fixed it forms a new rigid body. If that is not the case, the atom is assigned to the rigid body of the lower branching point or to that of the previous atom, depending on whether it belongs to the main chain or to a secondary branch. This rule is easy to program and gives the standard geometry representation automatically regardless of the chemical nature of the molecule. In addition, it can produce
Internal Coordinate Simulation |
121 |
reasonable models with some bond angles free, which is necessary for nucleic acids, for instance. Figure 2c shows the standard geometry rigid bodies for this specific example.
B. Energy Gradients
As we see, internal coordinates are always defined together with the transformation rules that tell how atomic Cartesian coordinates must be computed. This construction determines also how the structure changes in response to variation of internal coordinates, and it is easy to see that the infinitesimal displacements of atoms correspond to the motion of a tree. Namely, the first three atoms form the base, internal coordinates close to the base move almost the whole molecule, whereas variables in a specific branch move only the upper part of that branch. For example, when torsion 1-2-3-6 in Figure 2b is varied then, by definition, the first three atoms are not affected, whereas the fourth and the following atoms move together. Let us introduce the term ‘‘articulated body’’ to refer to the set of atoms that move when variable qi is varied and denote it Di. If qi is a translational variable, like a bond length or a Cartesian coordinate of the first atom, the articulated body is translated along a fixed direction, and it is an easy matter to show that the partial derivative of the potential energy, ∂U/∂qi, equals the component of the total force applied to Di along this direction. Therefore it can be computed as
|
∂U |
ei fα fi |
(1) |
∂qi |
|||
|
|
α Di |
|
where ei is the unit vector of the translation, fα are forces applied to atoms, and fi is the possible additional force due to the bond stretching potential. Similarly, when qi is a rotational variable, such as a torsion, a valence angle or one of the rigid body rotations of the base, the articulated body rotates around a certain fixed axis and ∂U/∂qi is equal to the projection upon it of the total torque applied to Di. It is computed as
|
∂U |
ei rα fα eiri fα ti |
(2) |
|
∂qi |
||||
|
|
α Di |
α Di |
|
where ei is the unit vector of the rotation axis, ri is its position vector, rα are atom position vectors, and ti is the possible additional torque due to the torsion or bond angle bending potential.
When the energy gradient is evaluated by a computer the calculation of atomic forces fα produced by bonded and nonbonded interactions takes a vast majority of time. Summations involved in Eqs. (1) and (2) are rapid because the articulated bodies fit one into another like in a Russian doll. In an unbranched chain D1 D2 Dn 1 Dn, and therefore the sums in Eqs. (1) and (2) can be computed starting from the tip of the chain, moving to the base, and at the ith variable adding only the contribution from subset Di/Di 1. Extension to trees is straightforward. The idea of such recurrent calculations belongs to Go and collaborators [28], and it appears in many other applications. As a result, in terms of computer time, energy gradients with respect to internal coordinates are no more expensive than atomic forces in Cartesian coordinate calculations. In fact, even a small savings can be achieved, because the last terms in Eqs. (1) and (2) can be evaluated directly.
122 |
Mazur |
IV. INTERNAL COORDINATE MOLECULAR DYNAMICS
A. Main Problems and Historical Perspective
Molecular dynamics has emerged as an application to molecules of the general method of point particles, with Cartesian coordinates and Newton’s equations, and it was first applied to flexible polymeric molecules more than 20 years ago [29,26,30]. It was already clear at that time that harmonic potentials that keep bond lengths and bond angles close to their standard values severely limited the time step and that it would be desirable to get rid of these ‘‘uninteresting’’ degrees of freedom. No wonder, therefore, that early attempts to apply internal coordinates with the standard geometry approximation in MD were made at the same time [19,31]. This way, however, appeared too complicated and has been abandoned in favor of an alternative approach proposed by Ryckaert et al. [32], which consists in imposing holonomic distance constraints upon a system of point particles governed by Newton’s equations. Their method, now called constraint dynamics, is reviewed elsewhere in this book. However, although it seemed initially that not only bond lengths but also bond angles, dihedral angles, and larger rigid groups could be fixed by using triangulation, this was found to be true only for very small molecules. In large complex polymers such as proteins, even bond angles cannot be fixed in this way [12].
The intrinsic limitation of constraint dynamics can be qualitatively understood from the underlying physical model. Imposing a distance constraint implies that a reaction force is introduced that is applied along a line joining two atoms. Reactions must be calculated at each time step to balance all other forces in the system, which for large branched molecules presents a difficult problem because they are all coupled and form a system of algebraic equations solved by iterations. With only bond lengths to hydrogens fixed, reactions are coupled in small groups, and iterations converge rapidly. With all bond lengths fixed, reactions are coupled globally, and looping becomes possible. In general, in this case the convergence is not guaranteed [33], but in practice it remains acceptable if reactions at the same atom are applied at obtuse angles. The last condition breaks down when bond angles are triangulated, and as a result the convergence becomes too slow.
These difficulties have led to a revival of work on internal coordinate approaches, and to date several such techniques have been reported based on methods of rigid-body dynamics [8,19,34–37] and the Lagrange–Hamilton formalism [38–42]. These methods often have little in common in their analytical formulations, but they all may be reasonably referred to as internal coordinate molecular dynamics (ICMD) to underline their main distinction from conventional MD: They all consider molecular motion in the space of generalized internal coordinates rather than in the usual Cartesian coordinate space. Their main goal is to compute long-duration macromolecular trajectories with acceptable accuracy but at a lower cost than Cartesian coordinate MD with bond length constraints. This task turned out to be more complicated than it seemed initially.
Two problems inherent in ICMD were clear from the beginning. The first is the derivation of equations of motion for large molecules, which was the main subject of the initial reports [19,34,38,39]. The second problem is the cost of additional calculations, which must be low enough to be compensated for by an increase in the step size. The need to invert the mass matrix to obtain generalized accelerations was the main obstacle. Generally, it is a full positive definite matrix; therefore, a direct inversion scales as O(n3) with the number of degrees of freedom and quickly becomes impractical when n exceeds 100. It turned out, fortunately, that this problem had been earlier solved in robot mechanics [43–46]. There are several recursive O(n) algorithms that can rapidly compute exact gener-