Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.), Computational Biochemistry and Biophysics

16

Computer Aided Drug Design

Alexander Tropsha and Weifan Zheng

University of North Carolina at Chapel Hill, Chapel Hill, North Carolina

I. INTRODUCTION

Computer-aided drug design (CADD) has enjoyed a long and successful history in its development and applications. Even before the computer age, chemists and medicinal chemists used their intuition and experience to design molecules with desired biological profiles. However, as in practically all areas of chemistry, rapid accumulation of vast amounts of experimental information of biologically active molecules naturally led to the development of theoretical and computational tools for the rational analysis and design of bioactive molecules. The power of these tools rests solidly on a variety of well-established scientific disciplines including theoretical and experimental chemistry, biochemistry, biophysics, pharmacology, and computer science. This unique combination of underlying scientific disciplines makes CADD an indispensable complementary tool for most types of experimental research in drug design and development.

The advancement of drug design as a quantitative, analytical methodology should be, perhaps, attributed to Professor Corwin Hansch, who in the early 1960s initiated the development of the quantitative structure–activity relationships (QSAR) method (reviewed in Ref. 1). Early QSAR studies made extensive use of various physicochemical properties (such as the octanol–water distribution coefficient, which serves as a measure of hydrophobicity) and chemical topology based molecular descriptors to arrive at reliable models correlating structure and activity. Beginning in the early 1970s, CADD started to employ computer graphics or, more specifically, molecular graphics. This led to the development of three-dimensional molecular modeling, which can be defined as the generation, manipulation, calculation, and prediction of realistic chemical structures and associated physicochemical and biological properties [adapted from Ref. 2]. The application of molecular modeling approaches in the drug design area culminated in the development of the first three-dimensional (3D) pharmacophore assignment method, the active analog approach (AAA) [3,4]. This was quickly followed by the development of pharmacophore-based database searching methodologies and the establishment of 3D QSAR approaches such as comparative molecular field analysis (CoMFA) [5], which remains one of the most popular methods of CADD today. Finally, beginning in the early 1980s, a growing number of drug design applications started to use experimental structural information about macromolecular drug targets, i.e., proteins and nucleic acids.


Modern methods of computer-aided drug design fall into two major categories: ligand-based and receptor-based methods. The former methods, which include QSAR, various pharmacophore assignment methods, and database searching or mining, are based entirely on experimental structure–activity relationships for receptor ligands or enzyme inhibitors. Their application in the last three or four decades led to several drugs currently on the market (reviewed in Ref. 6). The structure-based design methods, which include docking and advanced molecular simulations, require structural information about the receptor or enzyme that is available from X-ray crystallography, nuclear magnetic resonance (NMR) techniques, or protein homology model building. This strategy became popular only recently owing to rapid advances in structure elucidation methods [7] and has already led to several promising drug candidates and even marketed drugs such as several approved inhibitors of the HIV protease for the treatment of AIDS [8].

It is practically impossible to review all major developments, applications, approaches, and caveats of the various drug design methods in one chapter. Several monographs and numerous publications have been devoted to CADD [9–11], and many specialized reviews addressing various aspects of molecular modeling as applied to drug design have been published in recent years in the ongoing series Reviews in Computational Chemistry [12]. We decided to concentrate in this chapter on one outstanding problem of modern chemistry and theoretical biology, that of ligand–receptor recognition, which is the underlying mechanism of action of most drugs. Using this problem as a molecular modeling paradigm, we introduce major molecular modeling concepts and tools that one can apply to drug design. To provide a practical example of a drug design project and make the discussion somewhat didactic, we consider the dopaminergic ligand–receptor system, which has been the focus of our research in recent years [13–15]. We believe that by considering one particular drug design problem we can better illustrate the power and limitations of underlying techniques and at the same time provide a practical introduction to the major concepts of molecular modeling as applied to drug design.

II. RELATIONSHIPS BETWEEN EXPERIMENTAL AND THEORETICAL APPROACHES TO STUDYING DRUG–RECEPTOR INTERACTIONS

The ultimate goal of molecular modeling as a pharmacological and medicinal chemistry tool is to predict, in advance of any laboratory testing, novel biologically active compounds. Molecular modeling research starts from the analysis of experimental observables of drug–receptor interaction. This interaction leads to the formation of the ligand–receptor complex followed by the conformational change of the receptor, which constitutes the putative mechanism of signal transduction. One can obtain either ligand–receptor binding constants from direct measurements in vitro or biological activity data measured on isolated tissues or on the whole organism. Thus, experimental observables of ligand–receptor systems include the chemical structure of the ligands, biological activity or binding constants, receptor primary sequence, and, in some cases, receptor 3D structure (Fig. 1).

Molecular modeling uses the experimental observations as input data to programs that afford rational analysis of structure–activity relationships. As mentioned above, there are two main approaches to drug design: ligand-based and structure-based methods. First, if the receptor structure has been characterized by either X-ray crystallography or NMR, computer graphics methods can be used to design novel compounds based on complementarity between the surfaces of the ligand and the receptor binding site. Energy calculations may also be applied to quantitate and refine the graphics design. Second, if the receptor structure is unknown, one is left to infer the shape of the receptor binding site. Several receptor mapping techniques have been developed, such as AAA [3] and CoMFA [5]. The idea of these approaches is to construct a pharmacophore that integrates all the important structural and physicochemical features of the active compounds required for their successful binding to a receptor. Although the pharmacophore itself has significant predictive power as far as rational drug design is concerned, the design can be greatly facilitated if the receptor model is available. In many cases structural information about the receptor can be obtained experimentally by means of X-ray crystallography or NMR. However, in many instances, especially for transmembrane receptors such as G-protein coupled receptors (GPCRs), experimental structural determination is difficult if not impossible. Thus, theoretical structure modeling remains the only possible source of structural information about many receptors.

Figure 1 A flow chart of experimental information about drug–receptor interaction. The objects of experimental and theoretical investigation are boxed, and the experimental information is circled.

Recent successes in deciphering the primary sequences of many receptors and the development of several computerized protein-modeling tools have enabled researchers to devise putative three-dimensional models of these receptors from their primary sequences. The development of molecular docking methodologies [16,17] (which allow one to bring together the ligand and the receptor and ‘‘dock’’ the former into the latter) completes the computer representation of the natural process of ligand–receptor interaction (i.e., a pharmacophore-based conformationally constrained ligand is brought into the vicinity of the modeled receptor binding site via docking).

The interaction between ligands and their receptors is clearly a dynamic process. Once the static model of ligand–receptor interaction has been obtained, the stability of ligand–receptor complexes should be evaluated by means of molecular dynamics simulations [18].

One can see that at this point only two types of experimental data are used for the analysis of a ligand–receptor system: ligand chemical structure (from which one generates the 3D pharmacophore conformation) and the primary receptor sequence (which is used to generate the 3D receptor model). The most important property of the receptors, however, is their ability to discriminate ligands on the basis of their chemical structure; we quantify this as ligand binding constants. The ability to reproduce binding constants, or at least their relative order, is the most sensitive test of any putative receptor model. The technique that, in principle, can provide such data is the free energy simulation/thermodynamic cycle approach (see Chapter 9), a method that relies on full atomistic representations of both ligand and receptor in the molecular model [19]. This approach is very robust but requires substantial computational resources, which limits its practical applicability for drug design. Nevertheless, in several reported cases, it was able to reproduce accurately the experimental binding constants [20,21]. The entire modeling process, including tools used for evaluation and refinement of the putative ligand–receptor model, is summarized in Figure 2.

Figure 2 A flow chart of theoretical modeling of drug–receptor interaction and its relation to experiment. The objects of theoretical investigation are in rectangles, and the experimental information is in the ovals.
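The arithmetic behind the thermodynamic cycle can be made concrete with a short sketch. The relative binding free energy of two ligands A and B equals the difference between the (non-physical) A-to-B mutation free energies computed in the bound and free states; all numbers below are illustrative, not taken from any simulation in this chapter.

```python
import math

R = 1.987e-3  # gas constant, kcal/(mol*K)
T = 300.0     # temperature, K

def relative_binding(dG_mut_free, dG_mut_bound):
    """Thermodynamic cycle: ddG_bind(B - A) = dG_mut(bound) - dG_mut(free).
    Returns ddG (kcal/mol) and the corresponding ratio of binding
    constants K_B/K_A = exp(-ddG / RT)."""
    ddG = dG_mut_bound - dG_mut_free
    ratio = math.exp(-ddG / (R * T))
    return ddG, ratio

# Illustrative numbers: mutating A into B costs 2.0 kcal/mol in solution
# but only 0.5 kcal/mol in the complex, so B binds more tightly
# (ddG = -1.5 kcal/mol, roughly a 12-fold gain in affinity).
ddG, ratio = relative_binding(2.0, 0.5)
```

The point of the cycle is that both legs are alchemical mutations, which are far cheaper to converge than simulating the physical binding event itself.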

The successful realistic modeling of ligand–receptor interactions can only be based on a combined approach incorporating several computer-assisted modeling tools. As outlined above, these tools may include ligand-based ‘‘negative image’’ receptor modeling, QSAR, primary structure-based ‘‘real’’ receptor modeling, docking, structure refinement (molecular dynamics), and relative binding constant calculations. We now discuss these techniques in the order in which one can proceed while working on a receptor model.

III. COMPUTATIONAL APPROACHES TO MODELING LIGAND–RECEPTOR INTERACTIONS

A. Ligand-Based Approaches

The advent of faster computers and the development of new algorithms led in the late 1970s and early 1980s to the development and active use of 3D negative receptor image modeling tools such as AAA. In this approach, first proposed by Marshall et al. [3], the researcher infers the size, the shape, and some physicochemical parameters of the receptor binding site by modeling receptor ligands. The key assumption of this approach is that all the active receptor ligands adopt, in the vicinity of the binding site, conformations that present common pharmacophoric functional groups to the receptor. Thus, the receptor is thought of as a negative image of the generalized pharmacophore that incorporates all structural and physicochemical features of all active analogs overlapped in their pharmacophoric conformations. The latter are found in the course of a conformational search, starting from the most rigid active analogs and proceeding with more flexible compounds. Conformational restraints imposed by the more rigid compound(s) with respect to the internal geometry of common functional groups are used to facilitate the search for more flexible compounds. The AAA and similar approaches have been successfully applied to negative image modeling of many receptors, yielding important leads for rational drug design (see, e.g., Ref. 14). Most important, these models can be incorporated into a common database providing the source of ligand structures for pharmaceutical lead generation (see, e.g., Ref. 22). Furthermore, the same database can be used to search for potential specific activity or cross-reactivity of independently designed or synthesized ligands by comparing them with known 3D pharmacophores.

For illustration, let us consider the application of AAA to an important pharmacological class of dopaminergic ligands [13]. The analysis of pharmacological activity of D1 receptor ligands leads to their classification as active or inactive (see Table 1 and Fig. 3). Six compounds were chosen as active on the basis of three criteria: They had affinity for the D1 receptor (K0.5 < 300 nM); they could increase cAMP synthesis in rat striatal membranes to the same degree as dopamine; and this increase could be blocked completely by the D1 antagonist SCH23390. The compounds that met these criteria were dopamine, DHX, SKF89626, SKF82958, A70108, and A77636. As shown in Table 1, all of these compounds caused complete activation of dopamine-sensitive adenylate cyclase in this preparation and had K0.5 values ranging from 267 nM (dopamine) to 0.9 nM (A70108).

The molecular modeling studies with this set of dopaminergic ligands involved the following analytical steps:

1. The tentative pharmacophoric elements of the D1 receptor were determined on the basis of known structure–activity relationships.

2. A rigorous conformational search on the active compounds was performed to determine their lowest energy conformation(s).

Table 1 Pharmacological Analysis of DHX and Related Compounds

Drug               D1 affinity   D2 affinity   Adenylate cyclase   Max. stimulation of adenylate
                   K0.5 (nM)     K0.5 (nM)     EC50 (nM)           cyclase (% vs. DA)
Dopamine           267           36            5000                100
(±)DHX             2.3           43.8          30                  120
SKF89626           61            142           700                 120
SKF82958           4             73            491                 94
A70108             0.9           41            1.95                96
A77636             31.7          1290          5.1                 92
cis-DHX            10^3          10^3          10^4                17
N-Propyl-DHX       326           27            10^4                36
N-Allyl-DHX        328           182           10^4                32
Ro 21-7767         477           61            10^4                22
N-Benz-5,6-ADTN    10^3          10^3          10^3                38
N-Benz-6,7-ADTN    10^3          335           10^4                25


Figure 3 The chemical structures of the ligands used in the molecular modeling study of the D1 dopamine receptor. The ligands were divided into two groups (active and inactive) based on their pharmacological properties. The hypothesized pharmacophoric elements are shown in bold.

3. Conformationally flexible superimposition of these compounds was done to determine their common (pharmacophoric) conformation.

4. Similar conformational analyses were performed for inactive compounds, and inactive compounds in pharmacophoric conformations were superimposed with the active compounds to determine steric limitations in the active site. Where appropriate, the geometry of each inactive molecule was obtained by modifying the chemical structure of the relevant active analogs followed by the energy minimization of the resulting structure.

5. Finally, an evaluation was made of excluded receptor volume and shape as spatial equivalents of the volume and shape of the pharmacophore.

All molecular modeling studies were performed with the multifaceted molecular modeling software package SYBYL (Tripos Associates Inc., St. Louis, MO).


Based on the experimental SAR data, the following functional groups of agonists were defined as key elements of the D1 agonist pharmacophore (Fig. 3): the two hydroxyl groups of the catechol ring, the nitrogen, and (except for dopamine) an accessory hydrophobic group (e.g., the aromatic ring in dihydrexidine or SKF82958). Thus, the task of molecular modeling analysis was to identify a pharmacophoric conformation for each compound in which these key pharmacophoric elements were spatially arranged in the same way for all active compounds.

(a) Lowest Energy Conformations of Active Compounds. Construction of the Pharmacophore. The evaluation of the D1 agonist pharmacophore was based on the following three-step routine.

Step 1. Conduct a conformational search on each of the agonists to identify their lowest energy conformation(s).

Step 2. Find common low energy conformations for all of the compounds. The commonality was assessed by comparing the distances between each of the hydroxyl oxygens and the nitrogen and the angle between the planes of the catechol ring and the accessory ring.

Step 3. Overlay all agonists in their most common conformations using dihydrexidine as a template compound by superimposition of equivalent pharmacophoric atoms of all the agonists and those of DHX.
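The commonality test of Step 2 can be sketched numerically. The snippet below, a toy sketch only, assumes each conformer is given as a dictionary of hypothetical 3D coordinates (keys `N`, `O1`, `O2` for the pharmacophoric atoms, `catechol` and `accessory` for three atoms of each ring); the distance and angle tolerances are illustrative, not taken from the study.

```python
import math

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def plane_normal(p1, p2, p3):
    """Unit normal of the plane through three ring atoms (cross product)."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    return [c / norm for c in n]

def interplane_angle(ring1, ring2):
    """Angle (degrees, 0-90) between two ring planes."""
    n1, n2 = plane_normal(*ring1), plane_normal(*ring2)
    cosang = abs(sum(a * b for a, b in zip(n1, n2)))
    return math.degrees(math.acos(min(1.0, cosang)))

def pharm_match(conf_a, conf_b, dtol=0.5, atol=15.0):
    """Two conformers 'share' the pharmacophore if their O...N distances
    agree within dtol (angstroms) and their catechol/accessory ring-plane
    angles agree within atol (degrees). Tolerances are hypothetical."""
    same_d = all(abs(dist(conf_a['N'], conf_a[o]) -
                     dist(conf_b['N'], conf_b[o])) < dtol
                 for o in ('O1', 'O2'))
    same_ang = abs(interplane_angle(conf_a['catechol'], conf_a['accessory']) -
                   interplane_angle(conf_b['catechol'], conf_b['accessory'])) < atol
    return same_d and same_ang
```

In practice such a filter would be applied pairwise across the low energy conformer sets of all agonists, retaining only geometries common to every active compound.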

(b) D1 Receptor Mapping and Agonist Pharmacophore Development. The ‘‘pharm’’ configurations of the active molecules also were used to map the volume of the receptor site available for ligand binding. The steric mapping of the D1 receptor site, using the MVolume routine in SYBYL, involved the construction of a pseudoelectron density map for each of the active analogs superimposed in their pharmacophore conformations. A union of the van der Waals density maps of the active compounds defines the receptor excluded volume [3].

The essential feature of the AAA is a comparison of active and inactive molecules. A commonly accepted hypothesis to explain the lack of activity of inactive molecules that possess the pharmacophoric conformation is that their molecular volume, when presenting the pharmacophore, exceeds the receptor excluded volume. This additional volume apparently is filled by the receptor and is unavailable for ligand binding; this volume is termed the receptor essential volume [3]. Following this approach, the density maps for each of the inactive compounds (in their ‘‘pharm’’ conformations superimposed with that of active compounds) were constructed; the difference between the combined inactive compound density maps and the receptor excluded volume represents the receptor essential volume. These receptor-mapping techniques supplied detailed topographical data that allowed a steric model of the D1 receptor site to be proposed.
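The union/difference logic behind the excluded and essential volumes can be sketched with a simple voxel grid. This is not the SYBYL MVolume implementation, just a minimal illustration: atoms are (x, y, z, radius) spheres, a voxel is occupied if its center falls inside any sphere, and the grid spacing is arbitrary.

```python
from itertools import product

def vdw_voxels(atoms, spacing=0.5):
    """Voxelize the van der Waals volume of one molecule given as a list
    of (x, y, z, radius) spheres. Returns a set of occupied voxel indices."""
    occ = set()
    for x, y, z, r in atoms:
        n = int(r / spacing) + 1
        cx0, cy0, cz0 = round(x / spacing), round(y / spacing), round(z / spacing)
        for i, j, k in product(range(-n, n + 1), repeat=3):
            vx, vy, vz = cx0 + i, cy0 + j, cz0 + k
            dx, dy, dz = vx * spacing - x, vy * spacing - y, vz * spacing - z
            if dx * dx + dy * dy + dz * dz <= r * r:
                occ.add((vx, vy, vz))
    return occ

def excluded_volume(active_mols, spacing=0.5):
    """Receptor excluded volume: union of the vdW voxels of all
    superimposed active analogs."""
    vox = set()
    for atoms in active_mols:
        vox |= vdw_voxels(atoms, spacing)
    return vox

def essential_volume(inactive_mols, excluded, spacing=0.5):
    """Receptor essential volume: voxels occupied by inactive analogs
    (in their pharm conformations) but absent from the excluded volume."""
    vox = set()
    for atoms in inactive_mols:
        vox |= vdw_voxels(atoms, spacing)
    return vox - excluded
```

Multiplying the voxel count by spacing cubed gives a crude estimate of the mapped volume; refining the spacing trades accuracy for cost.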

These modeling efforts relied upon dihydrexidine as a structural template for determining molecular geometry because it was not only a high affinity full agonist, but it also had limited conformational flexibility relative to other more flexible, biologically active agonists. For all full agonists studied (dihydrexidine, SKF89626, SKF82958, A70108, A77636, and dopamine), the energy difference between the lowest energy conformer and those that displayed a common pharmacophore geometry was relatively small (< 5 kcal/mol). The pharmacophoric conformations of the full agonists were also used to infer the shape of the receptor binding site. Based on the union of the van der Waals density maps of the active analogs, the excluded receptor volume was calculated. Various inactive analogs (partial agonists with D1 K0.5 > 300 nM) subsequently were used to define the receptor essential volume (i.e., sterically intolerable receptor regions). These volumes, together with the pharmacophore results, were integrated into a three-dimensional model estimating the D1 receptor active site topography (see Fig. 4).

Figure 4 Excluded volume for the D1 agonist pharmacophore. The mesh volume shown by the black lines is a cross section of the excluded volume representing the receptor binding pocket. Dihydrexidine (see text) is shown in the receptor pocket. The gray mesh represents the receptor essential volume of inactive analogs. The hydroxyl binding, amine binding, and accessory regions are labeled, as is the steric occlusion region.

Figure 4 represents the typical result of the application of the active analog approach. Based on the steric description of the essential receptor volume, new ligands can be designed that fit geometrical descriptions of the pharmacophore. Indeed, recent research based on the active analog approach led to the design of a novel highly selective dopamine agonist dihydrexidine [14] shown in Figure 4.

B. Quantitative Structure–Activity Relationship Method

The quantitative structure–activity relationship (QSAR) approach was first introduced by Hansch and coworkers [23,24] on the basis of implications from linear free energy relationships (LFERs) in general and the Hammett equation in particular [25]. It is based upon the assumption that the difference in physicochemical properties accounts for the differences in biological activities of compounds. According to this approach, the changes in physicochemical properties that affect the biological activities of a set of congeners are of three major types: electronic, steric, and hydrophobic [26]. These structural properties are often described by Hammett electronic constants [27], Verloop STERIMOL parameters [28], hydrophobic constants [27], etc. The quantitative relationships between biological activity (or chemical property) and the structural parameters could be conventionally obtained by using multiple linear regression (MLR) analysis.

The typical QSAR equation is

log(1/C) = b0 + Σi bi Di


where C is drug concentration, b0 is a constant, bi is a regression coefficient, and Di is a molecular descriptor such as the Hansch π parameter (hydrophobicity) or the Hammett σ parameter (electron-donating or accepting properties), or molar refractivity (MR), which describes the volume and electronic polarizability, etc. The fundamentals and applications of this method in chemistry and biology have been summarized by Hansch and Leo [1]. This traditional QSAR approach has generated many useful and, in some cases, predictive QSAR equations and has led to several documented drug discoveries (see, e.g., Ref. 6 and references therein).
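The regression step of this equation can be sketched with ordinary least squares. The descriptor values below are synthetic, stand-ins for a Hansch π and a Hammett σ column, and the "activities" are generated from known coefficients so the fit can be checked.

```python
def fit_mlr(X, y):
    """Ordinary least squares for log(1/C) = b0 + sum_i b_i * D_i.
    X: rows of descriptor values; y: activities. Solves the normal
    equations (A^T A) b = A^T y by Gaussian elimination with pivoting."""
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    n = len(A[0])
    M = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
         for i in range(n)]
    v = [sum(A[k][i] * y[k] for k in range(len(A))) for i in range(n)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        v[i], v[p] = v[p], v[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= f * M[i][c]
            v[r] -= f * v[i]
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (v[i] - sum(M[i][c] * b[c] for c in range(i + 1, n))) / M[i][i]
    return b

# Hypothetical congener series: columns are a pi-like and a sigma-like
# descriptor; activities are generated from b0=1.0, b_pi=0.8, b_sigma=-1.2.
X = [[0.5, 0.2], [1.0, 0.0], [1.5, -0.1], [2.0, 0.3], [2.5, 0.1]]
y = [1.0 + 0.8 * pi - 1.2 * sigma for pi, sigma in X]
b0, b_pi, b_sigma = fit_mlr(X, y)
```

On noise-free synthetic data the fit recovers the generating coefficients exactly; with real assay data one would additionally report r2 and the standard errors of the coefficients.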

Many different approaches to QSAR have been developed since Hansch’s seminal work. These include both two-dimensional (2D) and 3D QSAR methods. The major differences among these methods can be analyzed from two viewpoints: (1) the structural parameters that are used to characterize molecular identities and (2) the mathematical procedure that is employed to obtain the quantitative relationship between a biological activity and the structural parameters.

Most of the 2D QSAR methods are based on graph theoretic indices, which have been extensively studied by Randic [29] and Kier and Hall [30,31]. Although these structural indices represent different aspects of molecular structures, their physicochemical meaning is unclear. Successful applications of these topological indices combined with multiple linear regression (MLR) analysis are summarized in Ref. 31. On the other hand, parameters derived from various experiments through chemometric methods have also been used in the study of peptide QSAR, where partial least squares (PLS) [32] analysis has been employed [33].
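As a concrete example of such a graph theoretic index, the first-order Kier–Hall molecular connectivity index can be computed directly from the hydrogen-suppressed bond graph; the sketch below takes the graph as a plain list of bonded atom pairs.

```python
import math

def chi1(bonds):
    """First-order molecular connectivity index (1-chi) for a
    hydrogen-suppressed graph given as a list of (i, j) bonds:
    sum over bonds of (delta_i * delta_j)**-0.5, where delta_i is the
    number of heavy-atom neighbors of atom i."""
    delta = {}
    for i, j in bonds:
        delta[i] = delta.get(i, 0) + 1
        delta[j] = delta.get(j, 0) + 1
    return sum(1.0 / math.sqrt(delta[i] * delta[j]) for i, j in bonds)

# n-butane (C-C-C-C): two (1,2) bonds and one (2,2) bond
print(round(chi1([(0, 1), (1, 2), (2, 3)]), 3))  # 1.914
```

The index distinguishes branching isomers of the same formula (isobutane gives 1.732 versus 1.914 for n-butane), which is exactly the kind of structural variation these topological descriptors encode.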

With the development of accurate computational methods for generating 3D conformations of chemical structures, QSAR approaches that employ 3D descriptors have been developed to address the problems of 2D QSAR techniques, e.g., their inability to distinguish stereoisomers. The examples of 3D QSAR include molecular shape analysis (MSA) [34], distance geometry [35,36], and Voronoi techniques [37].

Perhaps the most popular example of 3D QSAR is CoMFA. Developed by Cramer et al. [5], CoMFA elegantly combines the power of molecular graphics and partial least squares (PLS) technique and has found wide applications in medicinal chemistry and toxicity analysis (see Ref. 38 for several excellent reviews). This method is one of the most recent developments in the area of ligand-based receptor modeling. This approach combines traditional QSAR analysis and three-dimensional ligand alignment central to AAA into a powerful 3D QSAR tool. CoMFA correlates 3D electrostatic and van der Waals fields around sample ligands typically overlapped in their pharmacophoric conformations with their biological activity. This approach has been successfully applied to many classes of ligands [38].

CoMFA methodology is based on the assumption that since, in most cases, the drug–receptor interactions are noncovalent, the changes in biological activity or binding constants of sample compounds correlate with changes in electrostatic and van der Waals fields of these molecules. To initiate the CoMFA process, the test molecules should be structurally aligned in their pharmacophoric conformations; the latter are obtained by using, for instance, the AAA described above. After the alignment, steric and electrostatic fields of all molecules are sampled with a probe atom, usually an sp3 carbon bearing a +1 charge, on a rectangular grid that encompasses the structurally aligned molecules. The values of both van der Waals and electrostatic interaction between the probe atom and all atoms of each molecule are calculated at every lattice point on the grid using the force field equation described above and entered into the CoMFA QSAR table. This table thus contains thousands of columns, which makes it difficult to analyze; however, application of special multivariate statistical analysis routines such as partial least squares analysis, cross-validation, and bootstrapping ensures the statistical significance of the final CoMFA equation. A cross-validated R2 (q2) that is obtained as a result of this analysis serves as a quantitative measure of the quality and predictive ability of the final CoMFA model. The statistical meaning of the q2 is different from that of the conventional r2; a q2 value greater than 0.3 is considered significant.
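The field-sampling step that fills one row of the CoMFA table can be sketched as follows. The 12-6 and Coulomb constants, the dielectric, and the steric cap below are illustrative placeholders, not the Tripos parameters, and the grid is kept tiny for clarity.

```python
import math
from itertools import product

def comfa_fields(atoms, grid=(-2.0, 2.0), step=2.0, probe_q=1.0,
                 eps=4.0, A=1000.0, B=30.0, steric_cap=30.0):
    """Sample steric (12-6) and electrostatic (Coulomb) field values of one
    aligned molecule with a +1-charged, sp3-carbon-like probe atom on a
    rectangular grid. atoms: list of (x, y, z, partial_charge)."""
    lo, hi = grid
    coords = [lo + step * i for i in range(int((hi - lo) / step) + 1)]
    steric, elec = [], []
    for gx, gy, gz in product(coords, repeat=3):
        s = e = 0.0
        for x, y, z, q in atoms:
            r = max(math.dist((gx, gy, gz), (x, y, z)), 1e-6)
            s += A / r**12 - B / r**6             # van der Waals term
            e += 332.0 * probe_q * q / (eps * r)  # Coulomb term, kcal/mol
        steric.append(min(s, steric_cap))  # cap runaway steric values
        elec.append(e)
    return steric, elec

# One CoMFA "row": field values of a single, hypothetical aligned molecule.
steric, elec = comfa_fields([(0.0, 0.0, 0.0, -0.5), (1.0, 0.0, 0.0, 0.5)])
```

Stacking one such row per aligned molecule and regressing against activity with PLS yields the CoMFA model; the capping of steric values near atomic centers mirrors the cutoff real CoMFA applies inside the molecular surface.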

Recent trends in both 2D and 3D QSAR studies have focused on the development of optimal QSAR models through variable selection. This implies that only a subset of available descriptors of chemical structures that are the most meaningful and statistically significant in terms of correlation with biological activity is selected. The optimum selection of variables is achieved by combining stochastic search methods with correlation methods such as MLR, PLS analysis, or artificial neural networks (ANN) [39–44]. More specifically, these methods employ either generalized simulated annealing [39], genetic algorithms [40], or evolutionary algorithms [41–44] as the stochastic optimization tool. Since the effectiveness and convergence of these algorithms are strongly affected by the choice of a fitting function, several such functions have been applied to improve the performance of the algorithms [41,42]. It has been demonstrated that combinations of these algorithms with various chemometric tools have effectively improved the QSAR models compared to those without variable selection.
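The mechanics of such stochastic variable selection can be sketched with a toy simulated-annealing search over descriptor subsets, here scored by the leave-one-out q2 of a k-nearest-neighbor predictor. Everything below, data, temperature, scoring choice, is illustrative rather than any published protocol.

```python
import math, random

def loo_q2(X, y, subset, k=1):
    """Leave-one-out cross-validated q2 of a k-nearest-neighbor predictor
    restricted to the chosen descriptor subset."""
    press = ss = 0.0
    ymean = sum(y) / len(y)
    for i in range(len(X)):
        order = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: sum((X[i][c] - X[j][c]) ** 2 for c in subset))
        pred = sum(y[j] for j in order[:k]) / k
        press += (y[i] - pred) ** 2
        ss += (y[i] - ymean) ** 2
    return 1.0 - press / ss

def select_descriptors(X, y, n_iter=500, temp=0.05, seed=0):
    """Simulated-annealing search over descriptor subsets maximizing q2:
    flip one descriptor in or out per step, always accept improvements,
    and accept worsenings with Boltzmann probability."""
    rng = random.Random(seed)
    cur = {0}
    cur_q = loo_q2(X, y, cur)
    best, best_q = set(cur), cur_q
    for _ in range(n_iter):
        cand = cur ^ {rng.randrange(len(X[0]))}
        if not cand:
            continue
        q = loo_q2(X, y, cand)
        if q > cur_q or rng.random() < math.exp((q - cur_q) / temp):
            cur, cur_q = cand, q
            if q > best_q:
                best, best_q = set(cand), q
    return sorted(best), best_q

# Synthetic check: activity depends only on descriptor 1; 0, 2, 3 are noise,
# so the search should retain descriptor 1 and discard the noise columns.
rng = random.Random(1)
X = [[rng.random(), i / 10.0, rng.random(), rng.random()] for i in range(10)]
y = [row[1] for row in X]
subset, q2 = select_descriptors(X, y)
```

Swapping the q2 scorer for an MLR, PLS, or neural network fit, and the bit-flip move for genetic crossover, recovers the other variants mentioned in the text.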

The variable selection approaches have also been adopted for region selection in the area of 3D QSAR. For example, GOLPE [45] was developed using chemometric principles. q2-GRS [46] was developed based on independent CoMFA analyses of small areas (or regions) of near-molecular space to address the issue of optimal region selection in CoMFA analysis. More recently, a genetic algorithm based sampling of 3D regions of CoMFA fields was implemented [47]. These methods have been shown to improve the QSAR models compared to the original CoMFA technique.

Most of the QSAR techniques (both 2D and 3D) assume the existence of a linear relationship between a biological activity and molecular descriptors, which may be an adequate assumption for relatively small datasets (dozens of compounds). However, the fast collection of structural and biological data, owing to recent development of combinatorial chemistry and high throughput screening technologies, has challenged traditional QSAR techniques. First, 3D methods may be computationally too expensive for the analysis of a large volume of data, and in some cases, an automated and unambiguous alignment of molecular structures is not achievable [48]. Second, although existing 2D techniques are computationally efficient, the assumption of linearity in the SAR may not hold true, especially when a large number of structurally diverse molecules are included in the analysis. Thus, several nonlinear QSAR methods have been proposed in recent years. Most of these methods are based on either artificial neural network (ANN) [49] or machine learning techniques [50]. Such applications, especially when combined with variable selection, represent fast-growing trends in modern QSAR research.

It is important not to overinterpret the QSAR models in terms of the underlying mechanism of the ligand–receptor interaction or their value for the design of novel compounds with high biological activity. Successful QSAR models present a statistically significant correlation between chemical features of compounds (descriptors) and biological activity, which does not imply any particular mechanism of drug action. In some instances, such as when CoMFA or simple linear regression QSAR models are used with clear-cut physicochemical descriptors (e.g., van der Waals and electrostatic molecular fields or hydrophobicity), the models could be formally interpreted in terms of the structural modifications required to increase the biological activity. However, it is essential in all cases to avoid forecasting biological activity for compounds too structurally different from those in the training set.

C. Structure-Based Drug Design

Ligand-based methods of drug design provide only limited information about the stereochemical organization of the receptor binding site. Detailed knowledge of the active site geometry can be obtained experimentally by means of X-ray crystallography or NMR; unfortunately, despite rapid progress in practical applications of these techniques, the structures of very few receptors and ligand–receptor complexes are available experimentally. In the absence of any experimental information on the 3D structure of many receptors, predictions from primary sequence remain the only means of generating the receptor structure (cf. Chapter 14). Obviously, the knowledge of the active site geometry significantly facilitates the design of new drugs by providing real spatial and chemical limitations on the structures of newly designed molecules. We briefly review in the following subsections first the methods of molecular simulations applicable to the analysis of known ligand–receptor complexes and then the current state of the art in the area of modeling GPCRs, which include dopamine receptors. Excellent reviews of methods and successes in the area of structure-based drug design were published in the early 1990s [7,51].

1. Ligand–Receptor Docking

In ligand–receptor docking, the ligand is brought (manually or in some automated way) into the vicinity of the binding site and oriented so that electrostatic and van der Waals interactions (which correspond to Coulomb and dispersion terms, respectively, in molecular mechanics energy expressions) between ligand and receptor are optimized. Pharmacophoric conformation of ligands is frequently used in these studies. Earlier docking programs allowed the docking of only a rigid ligand into a rigid receptor, i.e., no conformational change of either receptor or ligand was permitted as the latter approached the former. Clearly, this is not the way the interaction occurs in nature, because both ligand and receptor are relatively flexible molecules that adjust their conformation to each other in the process of binding to maximize steric and chemical complementarity. Flexible docking calculations are very difficult and require an extremely fast computer. Nevertheless, the consideration of both ligand and receptor active site flexibility is becoming part of modern docking approaches [52]. Further descriptions of docking algorithms can be found in several publications [16,17,53].
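The scoring and search described above can be sketched in a few lines of crude rigid-body docking. The Lennard-Jones constants, dielectric, and pure-translation move set are placeholders (real programs also rotate the ligand and use far smarter search); atoms are hypothetical (x, y, z, charge) tuples.

```python
import math, random

def pair_energy(lig_atoms, rec_atoms, A=1000.0, B=30.0, diel=4.0):
    """Interaction energy of a ligand pose with a rigid receptor:
    Coulomb (electrostatic) plus 12-6 repulsion/dispersion terms.
    A, B, and diel are illustrative constants, not a real force field."""
    e = 0.0
    for lx, ly, lz, lq in lig_atoms:
        for rx, ry, rz, rq in rec_atoms:
            r = max(math.dist((lx, ly, lz), (rx, ry, rz)), 0.1)
            e += 332.0 * lq * rq / (diel * r) + A / r**12 - B / r**6
    return e

def rigid_dock(lig, rec, n_trials=2000, box=2.0, seed=0):
    """Crude rigid-body docking: random translations of the ligand within
    a box around its starting pose, keeping the lowest-energy pose."""
    rng = random.Random(seed)
    best_pose, best_e = lig, pair_energy(lig, rec)
    for _ in range(n_trials):
        dx, dy, dz = (rng.uniform(-box, box) for _ in range(3))
        pose = [(x + dx, y + dy, z + dz, q) for x, y, z, q in lig]
        e = pair_energy(pose, rec)
        if e < best_e:
            best_pose, best_e = pose, e
    return best_pose, best_e
```

Even this toy version illustrates the essential trade-off: the scoring function must be cheap enough to evaluate thousands of poses, yet discriminating enough that the lowest-energy pose resembles the true binding mode.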

2. Molecular Dynamics

Once the model of a ligand–receptor complex is built, its stability should be evaluated. Simple molecular mechanics optimization of the putative ligand–receptor complex leads only to the identification of the closest local minimum. Moreover, molecular mechanics optimization lacks two crucial properties of real molecular systems: temperature and, consequently, motion. Molecular dynamics studies the time-dependent evolution of the coordinates of complex multimolecular systems as a function of inter- and intramolecular interactions (see Chapter 3). Because simulations are usually performed at normal temperature (≈300 K), relatively low energy barriers, on the order of kT (≈0.6 kcal/mol), can


be easily overcome. Thus, if the starting configuration of the whole system (i.e., the drug–receptor complex) resulting from docking is separated from a more stable configuration by such a low barrier, molecular dynamics will take the system over the barrier. Molecular simulations may therefore identify more stable, and thus more realistic, conformational states of ligand–receptor complexes. Furthermore, they may provide unique information about conformational changes of the receptor due to ligand binding, and they may shed light on the intimate mechanisms of receptor activation that currently cannot be studied by any other technique. Finally, molecular simulations frequently incorporate solvent and thus allow solvent effects to be taken into account. Unfortunately, because of the inherently short elementary simulation step size (≈2 fs), this technique is presently limited to relatively short total simulation times, on the order of hundreds of picoseconds to nanoseconds. These limitations are due mainly to the fact that available computer power is still inadequate for significantly longer simulation times.
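The time stepping at the heart of such simulations is typically a velocity Verlet integrator. The one-dimensional sketch below uses assumed reduced units (the harmonic force, mass, and time step are illustrative choices, not values from any particular MD package); real MD codes apply the same half-kick/drift/half-kick pattern to every atom in three dimensions, with thermostats and constraints layered on top.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle with the velocity Verlet scheme:
    half velocity update, full position update, force recomputation,
    second half velocity update."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # half-kick
        x += dt * v                # drift
        f = force(x)               # force at the new position
        v += 0.5 * dt * f / mass   # half-kick
    return x, v

# Illustrative harmonic "bond" in reduced units: F(x) = -k x
k, m = 1.0, 1.0
x, v = velocity_verlet(1.0, 0.0, lambda q: -k * q, m, dt=0.01, n_steps=1000)
total_energy = 0.5 * m * v**2 + 0.5 * k * x**2  # stays near the initial 0.5
```

The need for a femtosecond-scale step comes from resolving the fastest bond vibrations, which is why a million integration steps still cover only nanoseconds of simulated time.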

3. Binding Constant Calculation

The combination of free energy simulation (FES) and the thermodynamic cycle (TC) approach is the most promising modern technique for calculating relative ligand–receptor binding constants by simulating ligand–receptor interaction [19]. Experimentally, equilibrium free energies of binding of two ligands to the same receptor are evaluated in two independent experiments, according to the following scheme.

Ligand1 + Receptor  --ΔG°1-->  Ligand1/Receptor

Ligand2 + Receptor  --ΔG°2-->  Ligand2/Receptor

To calculate the relative binding constants of the two ligands, Ligand1 and Ligand2, we construct the following cyclic scheme,

Ligand1 + Receptor  --ΔG°1-->  Ligand1/Receptor
       |                             |
      ΔG°3                          ΔG°4
       ↓                             ↓
Ligand2 + Receptor  --ΔG°2-->  Ligand2/Receptor

where ΔG°1 and ΔG°2 correspond to the experimental binding free energies of Ligand1 and Ligand2, respectively, and ΔG°3 and ΔG°4 correspond to the free energies of the formal transformation of the chemical structure of Ligand1 into that of Ligand2 in solution and in the binding site, respectively. From a thermodynamic viewpoint, the foregoing scheme represents a closed thermodynamic cycle that consists of four transformations: the binding of Ligand1 in solution to the receptor (ΔG°1); the chemical transformation of bound Ligand1 to bound Ligand2 (ΔG°4); the chemical transformation of Ligand1 in solution to Ligand2 in solution (ΔG°3); and the binding of Ligand2 in solution to the receptor (ΔG°2). Using the thermodynamic cycle (TC) relationship, ΔΔG°cycle = 0, one obtains the difference in the free energies of binding of Ligand1 and Ligand2, i.e., (ΔG°1 − ΔG°2). Thus,

ΔΔG°cycle = ΔG°1 + ΔG°4 − ΔG°3 − ΔG°2 = 0

so

ΔΔG°binding = ΔG°1 − ΔG°2 = ΔG°3 − ΔG°4


The latter two free energies of chemical transformation are computed in the course of FES. The advantage of this approach is that it avoids calculation of the ligand–receptor binding free energies per se, i.e., ΔG°1 and ΔG°2, which would be extremely computationally intensive, without sacrificing the theoretical rigor of calculating binding constants from molecular simulations of ligand–receptor interaction. Faster free energy simulation approaches based on linear response theory have been introduced to improve the computational efficiency of free energy simulations [54–56]. Despite its computational intensity, the FES–TC approach finds important and growing applications in the analysis of ligand–receptor interactions and drug design [57–60].

D. Chemical Informatics and Drug Design

Combinatorial chemical synthesis and high throughput screening have significantly increased the speed of the drug discovery process [61–63]. However, it is still impossible to synthesize all of the library compounds in a reasonably short period of time. As many as 3000^3 (2.7 × 10^10) compounds can be synthesized from a molecular scaffold with three different substitution positions when each of the positions has 3000 different substituents. If a chemist could synthesize 1000 compounds per week, 27 million weeks (≈0.5 million years) would be required to synthesize all of these compounds. Furthermore, many of these compounds may be structurally similar to each other, and the chemical information contained in the library can be redundant. Thus, there is a need for rational library design (i.e., rational selection of a subset of building blocks for combinatorial chemical synthesis) so that a maximum amount of information can be obtained while a minimum number of compounds are synthesized and tested. Similarly, there is a closely related task in computational database mining, i.e., rational sampling of a subset of compounds from commercially available or proprietary databases for biological testing. Thus, new challenges to computational drug design in the context of combinatorial chemistry include information management, rational library design, and database analysis. These are the research topics of chemoinformatics, a new area of computational chemistry.
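The combinatorial arithmetic above is easy to reproduce; note that the throughput of 1000 compounds per week is the text's illustrative assumption, not a measured figure.

```python
n_substituents = 3000
n_positions = 3

library_size = n_substituents ** n_positions  # 3000^3 = 2.7e10 virtual compounds
weeks = library_size // 1000                  # at 1000 compounds per week
years = weeks / 52                            # roughly half a million years
```

The exponential growth in `n_positions` is precisely why enumerating (let alone synthesizing) a full virtual library is hopeless, and why rational subset selection is required.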

Chemoinformatics (or cheminformatics) deals with the storage, retrieval, and analysis of chemical and biological data. Specifically, it involves the development and application of software systems for the management of combinatorial chemistry projects, the rational design of chemical libraries, and the analysis of the resulting chemical and biological data. The major research topics of chemoinformatics involve QSAR and diversity analysis. Researchers in this area must address several important issues. First, chemical structures should be characterized by calculable molecular descriptors that provide a quantitative representation of chemical structures. Second, similarity measures should be developed on the basis of these descriptors in order to quantify the structural similarity between pairs of molecules. Finally, adequate computational methods should be established for the efficient sampling of the huge combinatorial structural space of chemical libraries.
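A concrete example of a descriptor-based similarity measure is the Tanimoto (Jaccard) coefficient computed over binary structural fingerprints. The sketch below represents a fingerprint simply as a Python set of "on" bit positions; real systems derive these bits from substructure keys or hashed atom paths, but the comparison step is the same.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two binary fingerprints given as
    sets of 'on' bit positions: |A intersect B| / |A union B|."""
    if not fp_a and not fp_b:
        return 1.0  # two empty fingerprints: treat as identical
    common = len(fp_a & fp_b)
    return common / (len(fp_a) + len(fp_b) - common)

# Two hypothetical molecules sharing 2 of their 5 distinct features
similarity = tanimoto({1, 3, 5, 7}, {3, 5, 8})  # -> 0.4
```

Scores near 1 indicate near-duplicate structures; diversity-oriented library design seeks compound subsets whose pairwise scores stay low, while similarity searching around a lead seeks the opposite.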

Experimental combinatorial chemistry and high throughput screening research falls into two directions: targeted screening and broad screening [61,62]. The former approach involves the design and synthesis of chemical libraries whose compounds either are analogs of known active compounds or are expected to interact specifically with the biological target under study. This is desirable when a lead optimization (or evolution) program is being pursued. A broad screening project, on the other hand, involves the design and synthesis of a large array of maximally diverse chemical compounds, leading to diverse (or universal)