Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf

4

Conformational Analysis

Oren M. Becker

Tel Aviv University, Tel Aviv, Israel

I. BACKGROUND

The goal of conformational analysis is to shed light on conformational characteristics of flexible biomolecules and to gain insight into the relationship between their flexibility and their function. Because of the importance of this approach, conformational analysis plays a role in many computational projects ranging from computer-aided drug design to the analysis of molecular dynamics simulations and protein folding. In fact, most structure-based drug design projects today use conformational analysis techniques as part of their toolchest. As will be discussed in Chapter 16, in structure-based drug design a rational effort is applied to identifying potential drug molecules that bind favorably into a known three-dimensional (3D) binding site [1], the structure of which was determined through X-ray crystallography, NMR spectroscopy, or computer modeling. Because such an effort requires, among other things, structural compatibility between the drug candidate and the binding site, computational methods were developed to “dock” ligands into binding sites [2]. These docking calculations are used for screening large virtual molecular libraries, saving both time and money. However, although docking is fairly straightforward with rigid molecules, it becomes significantly more complicated when flexible molecules are considered. This is because flexible molecules can adopt many different conformations, each of which may, in principle, lead to successful docking.

Although there are a few “flexible docking” approaches that account for flexibility during the docking process itself, most docking applications rely on conformational analysis to deal with this problem (e.g., by generating a multitude of molecular conformations that are docked separately into the binding site). The importance of conformational analysis in the context of drug design extends beyond computational docking and screening. Conformational analysis is a major tool used to gain insight for future lead optimization. Furthermore, even when the 3D structure of the binding site is unknown, conformational analysis can yield insights into the structural characteristics of various drug candidates.

In a different context, conformational analysis is essential for the analysis of molecular dynamics simulations. As discussed in Chapter 3, the direct output of a molecular dynamics simulation is a set of conformations (“snapshots”) that were saved along the trajectory. These conformations are subsequently analyzed in order to extract information about the system. However, if, during a long simulation, the molecule moves from one conformation class to another, averaging over the whole simulation is likely to be misleading. Conformational analysis allows one to first identify whether such drastic conformational transitions have occurred and then to focus the analysis on one group of conformations at a time.

In view of their importance it is not surprising that conformation sampling and analysis constitute a very active and innovative field of research that is relevant to biomolecules and inorganic molecular clusters alike. The following sections offer an introduction to the main methodologies that are used as part of a conformational analysis study. These are arranged according to the three main steps applied in such studies: (1) conformation sampling, (2) conformation optimization, and (3) conformational analysis.

II. CONFORMATION SAMPLING

Conformation sampling is a process used to generate the collection of molecular conformations that will later be analyzed. Ideally, all locally stable conformations of the molecule should be accounted for in order for the conformational analysis to be complete. However, owing to the complexity of proteins and even fairly small peptides it is impractical to perform such an enumeration (see Section II.D.1). The number of locally stable conformations increases so fast with the molecular size that the task of full enumeration becomes formidable. Even enumerating all possible {φ, ψ} conformations of a protein backbone rapidly becomes intractable. As a result, most conformational studies must rely on sampling techniques. The basic requirement from such sampling procedures is that the resulting conformational sample (“ensemble”) will be representative of the system as a whole. This means that in most biomolecular studies a “canonical” ensemble, characterized by a constant temperature (see Chapter 3), is sought. Therefore sampling methods that were designed for canonical ensembles and that guarantee “detailed balance” are especially suitable for this task. Two such methods are high temperature molecular dynamics and Monte Carlo simulations. However, because of the complexity and volume of biomolecular conformational space, other sampling techniques, which do not adhere to the canonical ensemble constraint, are also often employed.
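To give a feeling for how quickly full enumeration becomes impractical, the following sketch counts discrete {φ, ψ} backbone conformations. It assumes, purely for illustration, that each torsion angle independently takes one of three values; this number and the function name are assumptions, not values taken from this chapter.

```python
def count_backbone_conformations(n_residues: int, states_per_torsion: int = 3) -> int:
    """Count discrete backbone conformations for a chain of n_residues.

    Assumption (illustrative only): every residue contributes two torsions,
    phi and psi, and each torsion independently adopts one of
    `states_per_torsion` values.
    """
    n_torsions = 2 * n_residues
    return states_per_torsion ** n_torsions


# The count grows exponentially with chain length.
for n in (5, 10, 50, 100):
    print(f"{n:4d} residues: {count_backbone_conformations(n):.3e} conformations")
```

Under these assumptions each added residue multiplies the count by nine, which is why sampling rather than enumeration is used for all but the smallest molecules.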

A. High Temperature Molecular Dynamics

Molecular dynamics simulations, which were discussed in Chapter 3, are among the most useful methods for sampling molecular conformational space. As the simulation proceeds, the classical trajectory that is traced is in fact a subset of the molecular conformations available to the molecule at that energy (for microcanonical simulations) or temperature (for canonical simulations). Assuming that the ergodic hypothesis holds (see Chapter 3), an infinitely long MD trajectory will cover all of conformational space. The problem with room temperature MD simulations is that a shorter finite-time trajectory is not likely to sample all of conformational space. Even a nanosecond MD trajectory will most likely be confined to limited regions of conformational space (Fig. 1a). The room temperature probability of crossing high energy barriers is often too small to be observed during a finite MD simulation.

Figure 1 A schematic view of (a) a low temperature simulation that is confined by high energy barriers to a small region of the energy landscape and (b) a high temperature simulation that can overcome those barriers and sample a larger portion of conformational space.

A common solution that allows one to overcome the limited sampling by MD simulations at room temperature is simply to raise the temperature of the simulation. The additional kinetic energy available in a higher temperature simulation makes crossing high energy barriers more likely and ensures a broad sampling of conformational space. In raising the simulation temperature to 1000 K or more, one takes advantage of the fact that chemical bonds cannot break in most biomolecular force fields (Chapter 2). Namely, the fact that bonds are modeled by a harmonic potential means that regardless of the simulation temperature these bonds can never spontaneously break, and the chemical integrity of the molecule remains intact. The effect of the unrealistically high temperatures employed is primarily to “shake” the system and allow the molecule to cross high energy barriers (Fig. 1b).
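The effect of temperature on barrier crossing can be illustrated with a rough estimate based on the Boltzmann factor exp(−ΔE/kT). The sketch below compares 300 K with 1000 K for a hypothetical barrier of 10 kcal/mol; the barrier height and the function name are assumptions chosen only for illustration, and the Boltzmann factor is a crude proxy for the actual crossing rate.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_factor(barrier_kcal: float, temperature_k: float) -> float:
    """Relative weight exp(-dE/kT) of reaching the top of an energy barrier."""
    return math.exp(-barrier_kcal / (K_B * temperature_k))


barrier = 10.0  # hypothetical barrier height in kcal/mol
for temperature in (300.0, 1000.0):
    print(f"T = {temperature:6.0f} K: factor = {boltzmann_factor(barrier, temperature):.2e}")

# Ratio of the two factors: how much more accessible the barrier top
# becomes when the temperature is raised from 300 K to 1000 K.
ratio = boltzmann_factor(barrier, 1000.0) / boltzmann_factor(barrier, 300.0)
print(f"enhancement: {ratio:.1e}")
```

For this assumed barrier the factor increases by several orders of magnitude, which is the sense in which the higher temperature “shakes” the system across barriers.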

There is no definite rule regarding what temperature is “high temperature” in this context, as this depends on the character of the underlying energy landscape. Temperatures on the order of 1000 K are often used for sampling the conformations of peptides and proteins, because this temperature is below the temperature at which unwanted cis–trans transitions of the peptide bond frequently occur [3]. In other cases, such as for sampling the conformations of a ligand bound in a protein’s active site, much lower temperatures must be used. Otherwise the ligand will dissociate and the simulation will sample the conformations of an unbound ligand rather than those of the bound ligand.

The main advantage of using MD for conformation sampling is that information about molecular forces is used to guide the search process into meaningful regions of the potential. A disadvantage of this sampling technique is that high temperature simulations sample not only the regions of interest at room temperature but also regions that are inaccessible to the molecule at room temperature. To overcome this problem the sampled conformations have to be energy-minimized, or preferably annealed, before being considered as sampled conformations. These methods will be discussed in Section III.

B. Monte Carlo Simulations

Monte Carlo search methods are stochastic techniques based on the use of random numbers and probability statistics to sample conformational space. The name “Monte Carlo” was originally coined by Metropolis and Ulam [4] during the Manhattan Project of World War II because of the similarity of this simulation technique to games of chance. Today a variety of Monte Carlo (MC) simulation methods are routinely used in diverse fields such as atmospheric studies, nuclear physics, traffic flow, and, of course, biochemistry and biophysics. In this section we focus on the application of the Monte Carlo method for conformational searching. More detailed accounts of these methods can be found in Refs. 5 and 6.

In performing a Monte Carlo sampling procedure we let the dice decide, again and again, how to proceed with the search process. In general, a Monte Carlo search consists of two steps: (1) generating a new “trial conformation” and (2) deciding whether the new conformation will be accepted or rejected.

Starting from any given conformation we “roll the dice,” i.e., we let the computer choose random numbers, to decide what will be the next trial conformation. The precise details of how these moves are constructed vary from one study to another, but most share similar traits. For example, assuming that the search proceeds via polypeptide torsion moves, choosing a new trial conformation could include the following steps. First, roll the dice to randomly select an amino acid position along the polypeptide backbone. Second, randomly select which of the several rotatable bonds in that amino acid will be modified (e.g., the φ, ψ, or χ torsion angles). Finally, randomly select a new value for this torsion angle from a predefined set of values. In this example it took three separate random selections to generate a new trial conformation. Multiple torsion moves as well as Cartesian coordinate moves are among the many possible variations on this procedure.
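The three random selections described above can be sketched as follows. The data layout (a list of per-residue dictionaries), the set of allowed angle values, and the function name are all hypothetical choices made for this illustration; real programs use their own move sets and representations.

```python
import random

TORSION_NAMES = ("phi", "psi", "chi")
# Hypothetical predefined set of torsion values, in degrees.
ALLOWED_VALUES = (-180.0, -120.0, -60.0, 0.0, 60.0, 120.0)


def propose_trial(conformation, rng):
    """Return a trial conformation that differs by at most one torsion.

    `conformation` is a list with one dict per residue, mapping a torsion
    name to its angle in degrees. The input is left unchanged.
    """
    trial = [dict(residue) for residue in conformation]  # copy each residue
    position = rng.randrange(len(trial))                 # 1: pick a residue
    torsion = rng.choice(TORSION_NAMES)                  # 2: pick a torsion
    trial[position][torsion] = rng.choice(ALLOWED_VALUES)  # 3: pick a value
    return trial


rng = random.Random(0)
current = [{"phi": -60.0, "psi": -45.0, "chi": 180.0} for _ in range(4)]
trial = propose_trial(current, rng)
print(trial)
```

Note that the newly drawn value may coincide with the old one, in which case the trial conformation is identical to the current one; whether to redraw in that case is a design choice left open here.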

Once a new “trial conformation” is created, it is necessary to determine whether this conformation will be accepted or rejected. If rejected, the above procedure will be repeated, randomly creating new trial conformations until one of them is accepted. If accepted, the new conformation becomes the “current” conformation, and the search process continues from it. The trial conformation is usually accepted or rejected according to a temperature-dependent probability of the Metropolis type,

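\[
p =
\begin{cases}
e^{-\beta \Delta U}, & e^{-\beta \Delta U} < 1 \\
1, & e^{-\beta \Delta U} \ge 1
\end{cases}
\qquad \text{or} \qquad
p = \min\left[1,\, e^{-\beta \Delta U}\right]
\tag{1}
\]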

where β = 1/kT and ΔU is the change in the potential energy. This means that if the energy of the new trial conformation is lower than that of the current conformation, ΔU < 0, it is always accepted. But even if the energy of the trial conformation is higher than the current energy, ΔU > 0, there is a certain probability, given by the Boltzmann factor, that it will be accepted. To find out whether a higher energy trial conformation is accepted, a random number r in the range [0, 1] is selected and compared to the Metropolis probability defined in Eq. (1). If r ≤ p, the conformation is accepted; otherwise it is rejected. This acceptance probability satisfies the principle of detailed balance, ensuring that if the process continues for a long enough time then a stationary solution will be achieved.
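A minimal sketch of the acceptance test of Eq. (1) is given below. The function names and the choice of kcal/mol as the energy unit are assumptions made for this illustration.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def metropolis_probability(delta_u: float, temperature_k: float) -> float:
    """Acceptance probability p = min[1, exp(-beta * dU)] with beta = 1/kT."""
    if delta_u <= 0.0:
        return 1.0  # downhill (or equal-energy) moves are always accepted
    beta = 1.0 / (K_B * temperature_k)
    return math.exp(-beta * delta_u)


def accept(delta_u: float, temperature_k: float, rng: random.Random) -> bool:
    """Decide whether a trial conformation is accepted.

    rng.random() returns r in [0, 1), so the test r < p succeeds with
    probability exactly p and always succeeds when p = 1.
    """
    r = rng.random()
    return r < metropolis_probability(delta_u, temperature_k)


rng = random.Random(42)
print(metropolis_probability(-2.0, 300.0))  # downhill move
print(metropolis_probability(1.0, 300.0))   # uphill move at 300 K
print(metropolis_probability(1.0, 1000.0))  # same uphill move at 1000 K
print(accept(1.0, 300.0, rng))
```

Returning early for ΔU ≤ 0 avoids evaluating the exponential for large negative ΔU, where it could overflow.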

In Monte Carlo simulations, just as in MD simulations, temperature plays an important role. In general, MC simulations tend to move toward low energy states. However, at high temperatures (small β values) there is a significant probability of climbing up energy slopes, allowing the search process to cross high energy barriers. This probability becomes significantly smaller at low temperatures, and it vanishes altogether in the limit of T → 0, where the method becomes equivalent to a minimization process. Thus, high temperature MC is often used to sample broad regions of conformational space.
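The role of temperature can be illustrated with a toy Metropolis walk on a one-dimensional double-well potential. This is a sketch under stated assumptions, not a biomolecular simulation: the potential, the step size, the reduced units (energies and kT in the same arbitrary unit), and all names are hypothetical.

```python
import math
import random


def double_well(x: float, barrier: float = 5.0) -> float:
    """Toy potential with minima at x = -1 and x = +1 and a barrier at x = 0."""
    return barrier * (x * x - 1.0) ** 2


def count_barrier_crossings(kt: float, n_steps: int = 20000,
                            step: float = 0.25, seed: int = 0) -> int:
    """Run a Metropolis walk starting at x = -1 and count crossings of x = 0."""
    rng = random.Random(seed)
    x = -1.0
    u = double_well(x)
    crossings = 0
    for _ in range(n_steps):
        x_trial = x + rng.uniform(-step, step)
        u_trial = double_well(x_trial)
        delta_u = u_trial - u
        # Metropolis criterion: always accept downhill moves, accept
        # uphill moves with probability exp(-dU/kT).
        if delta_u <= 0.0 or rng.random() < math.exp(-delta_u / kt):
            if (x < 0.0) != (x_trial < 0.0):
                crossings += 1
            x, u = x_trial, u_trial
    return crossings


for kt in (0.01, 0.5, 5.0):
    print(f"kT = {kt:5.2f}: {count_barrier_crossings(kt)} barrier crossings")
```

At the lowest kT the walk stays in the well where it started, which mimics the minimization-like behavior in the T → 0 limit, whereas at the highest kT the barrier is crossed frequently.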

As stated above, MC simulations are popular in many diverse fields. Their popularity is due mainly to their ease of use and their good convergence properties. Nonetheless, straightforward application of MC methods to biomolecules is often problematic due