4
Conformational Analysis
Oren M. Becker
Tel Aviv University, Tel Aviv, Israel
I. BACKGROUND
The goal of conformational analysis is to shed light on conformational characteristics of flexible biomolecules and to gain insight into the relationship between their flexibility and their function. Because of the importance of this approach, conformational analysis plays a role in many computational projects ranging from computer-aided drug design to the analysis of molecular dynamics simulations and protein folding. In fact, most structure-based drug design projects today use conformational analysis techniques as part of their tool chest. As will be discussed in Chapter 16, in structure-based drug design a rational effort is applied to identifying potential drug molecules that bind favorably into a known three-dimensional (3D) binding site [1], the structure of which was determined through X-ray crystallography, NMR spectroscopy, or computer modeling. Because such an effort requires, among other things, structural compatibility between the drug candidate and the binding site, computational methods were developed to “dock” ligands into binding sites [2]. These docking calculations are used for screening large virtual molecular libraries, saving both time and money. However, although docking is fairly straightforward with rigid molecules, it becomes significantly more complicated when flexible molecules are considered. This is because flexible molecules can adopt many different conformations, each of which may, in principle, lead to successful docking.
Although there are a few “flexible docking” approaches that account for flexibility during the docking process itself, most docking applications rely on conformational analysis to deal with this problem (e.g., by generating a multitude of molecular conformations that are docked separately into the binding site). The importance of conformational analysis in the context of drug design extends beyond computational docking and screening. Conformational analysis is a major tool used to gain insight for future lead optimization. Furthermore, even when the 3D structure of the binding site is unknown, conformational analysis can yield insights into the structural characteristics of various drug candidates.
In a different context, conformational analysis is essential for the analysis of molecular dynamics simulations. As discussed in Chapter 3, the direct output of a molecular dynamics simulation is a set of conformations (“snapshots”) that were saved along the trajectory. These conformations are subsequently analyzed in order to extract information about the system. However, if, during a long simulation, the molecule moves from one conformation class to another, averaging over the whole simulation is likely to be misleading. Conformational analysis allows one to first identify whether such drastic conformational transitions have occurred and then to focus the analysis on one group of conformations at a time.
In view of their importance it is not surprising that conformation sampling and analysis constitute a very active and innovative field of research that is relevant to biomolecules and inorganic molecular clusters alike. The following sections offer an introduction to the main methodologies that are used as part of a conformational analysis study. These are arranged according to the three main steps applied in such studies: (1) conformation sampling, (2) conformation optimization, and (3) conformational analysis.
II. CONFORMATION SAMPLING
Conformation sampling is a process used to generate the collection of molecular conformations that will later be analyzed. Ideally, all locally stable conformations of the molecule should be accounted for in order for the conformational analysis to be complete. However, owing to the complexity of proteins and even fairly small peptides, it is impractical to perform such an enumeration (see Section II.D.1). The number of locally stable conformations increases so fast with molecular size that the task of full enumeration becomes formidable. Even enumerating all possible {φ, ψ} conformations of a protein backbone rapidly becomes intractable. As a result, most conformational studies must rely on sampling techniques. The basic requirement of such sampling procedures is that the resulting conformational sample (“ensemble”) be representative of the system as a whole. This means that in most biomolecular studies a “canonical” ensemble, characterized by a constant temperature (see Chapter 3), is sought. Therefore sampling methods that were designed for canonical ensembles and that guarantee “detailed balance” are especially suitable for this task. Two such methods are high temperature molecular dynamics and Monte Carlo simulations. However, because of the complexity and volume of biomolecular conformational space, other sampling techniques, which do not adhere to the canonical ensemble constraint, are also often employed.
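To make the enumeration problem noted above concrete, the following minimal sketch counts backbone conformations under a deliberately crude assumption: each residue's {φ, ψ} pair is restricted to a small, fixed number of discrete states that combine independently. The function name and the choice of three states per residue are illustrative assumptions, not values taken from this chapter.

```python
def count_backbone_conformations(n_residues, states_per_residue=3):
    """Number of backbone conformations if every residue independently
    adopts one of `states_per_residue` discrete {phi, psi} states.

    Both the independence of residues and the default of three states
    are simplifying assumptions made only for illustration.
    """
    return states_per_residue ** n_residues


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        # The count grows exponentially with chain length.
        print(f"{n:4d} residues: {count_backbone_conformations(n):.3e} conformations")
```

Even with only three states per residue, a 100-residue chain has more than 10^47 combinations under these assumptions, which is why sampling rather than enumeration is used.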
A. High Temperature Molecular Dynamics
Molecular dynamics simulations, which were discussed in Chapter 3, are among the most useful methods for sampling molecular conformational space. As the simulation proceeds, the classical trajectory that is traced is in fact a subset of the molecular conformations available to the molecule at that energy (for microcanonical simulations) or temperature (for canonical simulations). Assuming that the ergodic hypothesis holds (see Chapter 3), an infinitely long MD trajectory will cover all of conformational space. The problem with room temperature MD simulations is that a shorter finite-time trajectory is not likely to sample all of conformational space. Even a nanosecond MD trajectory will most likely be confined to limited regions of conformational space (Fig. 1a). The room temperature probability of crossing high energy barriers is often too small to be observed during a finite MD simulation.
Figure 1 A schematic view of (a) a low temperature simulation that is confined by high energy barriers to a small region of the energy landscape and (b) a high temperature simulation that can overcome those barriers and sample a larger portion of conformational space.
A common solution that allows one to overcome the limited sampling by MD simulations at room temperature is simply to raise the temperature of the simulation. The additional kinetic energy available in a higher temperature simulation makes crossing high energy barriers more likely and ensures a broad sampling of conformational space. In raising the simulation temperature to 1000 K or more, one takes advantage of the fact that chemical bonds cannot break in most biomolecular force fields (Chapter 2). Namely, the fact that bonds are modeled by a harmonic potential means that regardless of the simulation temperature these bonds can never spontaneously break, and the chemical integrity of the molecule remains intact. The effect of the unrealistically high temperatures employed is primarily to “shake” the system and allow the molecule to cross high energy barriers (Fig. 1b).
There is no definite rule regarding what temperature is “high temperature” in this context, as this depends on the character of the underlying energy landscape. Temperatures on the order of 1000 K are often used for sampling the conformations of peptides and proteins, because this temperature is below the temperature at which unwanted cis–trans transitions of the peptide bond frequently occur [3]. In other cases, such as for sampling the conformations of a ligand bound in a protein’s active site, much lower temperatures must be used. Otherwise the ligand will dissociate and the simulation will sample the conformations of an unbound ligand rather than those of the bound ligand.
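The effect of temperature on barrier crossing can be illustrated with a rough estimate. The sketch below compares the Boltzmann factor exp(−ΔE/kT) for a single barrier at 300 K and at 1000 K. The 10 kcal/mol barrier height is a hypothetical value chosen only for illustration, and the Boltzmann factor is used here as a crude proxy for the relative crossing probability, not as a rate calculation.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_factor(barrier_kcal_per_mol, temperature_k):
    """Relative weight exp(-dE / kT) of reaching the top of a barrier."""
    return math.exp(-barrier_kcal_per_mol / (K_B * temperature_k))


if __name__ == "__main__":
    barrier = 10.0  # hypothetical barrier height, kcal/mol
    for temperature in (300.0, 1000.0):
        print(f"T = {temperature:6.0f} K: {boltzmann_factor(barrier, temperature):.3e}")
    ratio = boltzmann_factor(barrier, 1000.0) / boltzmann_factor(barrier, 300.0)
    print(f"enhancement at 1000 K relative to 300 K: {ratio:.2e}")
```

For this assumed barrier the factor increases by roughly five orders of magnitude between 300 K and 1000 K, which is the sense in which high temperature “shakes” the system over barriers.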
The main advantage of using MD for conformation sampling is that information about molecular forces is used to guide the search process into meaningful regions of the potential. A disadvantage associated with this sampling technique is that high temperature simulations sample not only the regions of interest at room temperature but also regions that are inaccessible to the molecule at room temperature. To overcome this problem, the sampled conformations have to be energy-minimized or, preferably, annealed before being considered as sampled conformations. These methods will be discussed in Section III.
B. Monte Carlo Simulations
Monte Carlo search methods are stochastic techniques based on the use of random numbers and probability statistics to sample conformational space. The name “Monte Carlo” was originally coined by Metropolis and Ulam [4] during the Manhattan Project of World War II because of the similarity of this simulation technique to games of chance. Today a variety of Monte Carlo (MC) simulation methods are routinely used in diverse fields such as atmospheric studies, nuclear physics, traffic flow, and, of course, biochemistry and biophysics. In this section we focus on the application of the Monte Carlo method for conformational searching. More detailed accounts of these methods can be found in Refs. 5 and 6.
In performing a Monte Carlo sampling procedure we let the dice decide, again and again, how to proceed with the search process. In general, a Monte Carlo search consists of two steps: (1) generating a new “trial conformation” and (2) deciding whether the new conformation will be accepted or rejected.
Starting from any given conformation we “roll the dice,” i.e., we let the computer choose random numbers, to decide what will be the next trial conformation. The precise details of how these moves are constructed vary from one study to another, but most share similar traits. For example, assuming that the search proceeds via polypeptide torsion moves, choosing a new trial conformation could include the following steps. First, roll the dice to randomly select an amino acid position along the polypeptide backbone. Second, randomly select which of the several rotatable bonds in that amino acid will be modified (e.g., the φ, ψ, or χ torsion angles). Finally, randomly select a new value for this torsion angle from a predefined set of values. In this example it took three separate random selections to generate a new trial conformation. Multiple torsion moves as well as Cartesian coordinate moves are among the many possible variations on this procedure.
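The three random selections described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the representation of a conformation as a list of per-residue dictionaries of torsion angles, the function name `propose_torsion_move`, and the set of allowed angle values are all assumptions introduced here.

```python
import random


def propose_torsion_move(torsions, allowed_values, rng):
    """Build a trial conformation by changing one randomly chosen torsion.

    `torsions` is a list with one dict per residue, mapping torsion names
    (e.g. "phi", "psi", "chi1") to angles in degrees. This data layout is
    an illustrative assumption.
    """
    residue = rng.randrange(len(torsions))        # 1. pick an amino acid position
    name = rng.choice(sorted(torsions[residue]))  # 2. pick one of its rotatable bonds
    value = rng.choice(allowed_values)            # 3. pick a new value from a predefined set
    trial = [dict(angles) for angles in torsions]  # copy, so the current conformation is kept
    trial[residue][name] = value
    return trial, (residue, name, value)


if __name__ == "__main__":
    rng = random.Random(0)
    current = [{"phi": -60, "psi": -45}, {"phi": -120, "psi": 130, "chi1": 60}]
    trial, move = propose_torsion_move(current, [-180, -120, -60, 0, 60, 120], rng)
    print("move (residue, torsion, new value):", move)
    print("trial conformation:", trial)
```

The current conformation is copied rather than modified in place so that it can be retained unchanged if the trial conformation is later rejected.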
Once a new “trial conformation” is created, it is necessary to determine whether this conformation will be accepted or rejected. If rejected, the above procedure will be repeated, randomly creating new trial conformations until one of them is accepted. If accepted, the new conformation becomes the “current” conformation, and the search process continues from it. The trial conformation is usually accepted or rejected according to a temperature-dependent probability of the Metropolis type,
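$$
p = \begin{cases} e^{-\beta \Delta U}, & e^{-\beta \Delta U} < 1 \\ 1, & e^{-\beta \Delta U} \geq 1 \end{cases}
\qquad \text{or} \qquad
p = \min\left[1,\, e^{-\beta \Delta U}\right] \tag{1}
$$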
where β = 1/kT and ∆U is the change in the potential energy. This means that if the energy of the new trial conformation is lower than that of the current conformation, ∆U < 0, it is always accepted. But even if the energy of the trial conformation is higher than the current energy, ∆U > 0, there is a certain probability, equal to the Boltzmann factor exp(−β∆U), that it will be accepted. To find out whether a higher energy trial conformation is accepted, a random number r in the range [0, 1] is selected and compared to the Metropolis probability defined in Eq. (1). If r ≤ p, the conformation is accepted; otherwise it is rejected. This acceptance probability satisfies the principle of detailed balance, ensuring that if the process continues for a long enough time then a stationary solution will be achieved.
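The acceptance rule of Eq. (1) can be written as a short function. The sketch below assumes energies in kcal/mol and temperature in kelvin; the function names and the unit choice are assumptions made for illustration.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def metropolis_probability(delta_u, temperature_k):
    """Acceptance probability p = min[1, exp(-beta * dU)] of Eq. (1)."""
    if delta_u <= 0.0:
        return 1.0  # downhill (or equal energy) moves are always accepted
    beta = 1.0 / (K_B * temperature_k)
    return math.exp(-beta * delta_u)


def metropolis_accept(delta_u, temperature_k, rng):
    """Draw a uniform random number r and accept the trial move if r <= p."""
    return rng.random() <= metropolis_probability(delta_u, temperature_k)


if __name__ == "__main__":
    rng = random.Random(0)
    for delta_u in (-1.0, 0.5, 2.0):
        p = metropolis_probability(delta_u, 300.0)
        print(f"dU = {delta_u:+.1f} kcal/mol: p = {p:.4f}, "
              f"accepted = {metropolis_accept(delta_u, 300.0, rng)}")
```

Handling the case ∆U ≤ 0 separately avoids evaluating the exponential for large negative ∆U, where it could overflow.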
In Monte Carlo simulations, just as in MD simulations, temperature plays an important role. In general, MC simulations tend to move toward low energy states. However, at high temperatures (small β values) there is a significant probability of climbing up energy slopes, allowing the search process to cross high energy barriers. This probability becomes significantly smaller at low temperatures, and it vanishes altogether in the limit of T → 0, where the method becomes equivalent to a minimization process. Thus, high temperature MC is often used to sample broad regions of conformational space.
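The role of temperature described above can be seen in a toy calculation. The following sketch runs a Metropolis random walk on a one-dimensional double-well potential and reports the fraction of steps spent in the well opposite to the starting one. The potential, barrier height, step size, and the two kT values are all arbitrary illustrative assumptions in reduced units, not parameters of any biomolecular system.

```python
import math
import random


def double_well(x, barrier):
    """Toy potential with minima at x = -1 and x = +1 and a barrier at x = 0."""
    return barrier * (x * x - 1.0) ** 2


def fraction_in_right_well(kT, n_steps=50_000, barrier=5.0, max_step=0.5, seed=0):
    """Metropolis walk started in the left well; returns the fraction of
    steps at which the walker is found at x > 0."""
    rng = random.Random(seed)
    x = -1.0
    u = double_well(x, barrier)
    steps_right = 0
    for _ in range(n_steps):
        x_trial = x + rng.uniform(-max_step, max_step)  # random trial move
        u_trial = double_well(x_trial, barrier)
        delta_u = u_trial - u
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta_u <= 0.0 or rng.random() < math.exp(-delta_u / kT):
            x, u = x_trial, u_trial
        if x > 0.0:
            steps_right += 1
    return steps_right / n_steps


if __name__ == "__main__":
    for kT in (0.05, 5.0):
        print(f"kT = {kT:5.2f}: fraction of steps in right well = "
              f"{fraction_in_right_well(kT):.3f}")
```

With the barrier equal to 100 kT the walker remains trapped in its starting well, whereas with the barrier equal to 1 kT it moves freely between the two wells, in line with the use of high temperature MC for broad sampling.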
As stated above, MC simulations are popular in many diverse fields. Their popularity is due mainly to their ease of use and their good convergence properties. Nonetheless, straightforward application of MC methods to biomolecules is often problematic due