
Solid-Phase Synthesis and Combinatorial Technologies


5.4 LIBRARY DESIGN VIA COMPUTATIONAL TOOLS 193

[Scheme for Figure 5.23: iterative selection from the virtual library against lead 5.3.]

a: random selection of a virtual structure;
b: similarity measurement (based on 5.3);
c: random replacement of 1/2 monomers;
d, e: selection of the most similar virtual structure to 5.3;
f: repeating a–e N times for 4 cycles (cycle 1: N = 30; cycle 2: N = 48; cycle 3: N = 57; cycle 4: N > 60).

 

 

 

 

 

 

Figure 5.23 Virtual screening of a 15,625-membered tripeptoid virtual library using SA-based selection: 5.3 as a lead compound for computational selection.

194 SYNTHETIC ORGANIC LIBRARIES: LIBRARY DESIGN AND PROPERTIES

 

 

[Scheme for Figure 5.24: general tripeptoid structure with R1–R3 = 25 different alkyls or aryls (15,625 virtual library components), and the lead structures 5.4 and 5.5.]

Selection using 5.4 as a lead: 10 recurrent fragments (A–J).
Selection using 5.5 as a lead: 5 recurrent fragments (B, G, I, K, L).

Figure 5.24 Virtual screening of a 15,625-membered tripeptoid virtual library using SA-based selection: met-enkephalin 5.4 and morphine 5.5 as lead compounds for computational selection.

binding activity, and for each performed four series of 40 SA similarity-based selection cycles on the same 15,625-membered virtual tripeptoid library. The results of these series, expressed as the occurrence of each building block in the 50 most similar structures found, are reported for both lead structures in Fig. 5.24. The two nonpeptoid leads produced complementary patterns, and selecting the 12 building blocks A–L that occurred with statistical significance would have found all three active tripeptoids (Fig. 5.25) identified when the original library was tested for μ-opiate activity (109). A

 

 

 

 

 

 

 

 

 

 

 

[Scheme for Figure 5.25: the three active tripeptoids 5.6, 5.7, and 5.8, built from building blocks A, D, I, J, and L, with μ-opiate binding values of 6 nM, 31 nM, and 46 nM.]

Selection using 5.4 (met-enkephalin) as a lead structure: A, D, I, J (4 out of 5) detected.
Selection using 5.5 (morphine) as a lead structure: I, L (2 out of 5) detected.
Selection using 5.4 and 5.5 as lead structures: 12 substituents detected, 5 out of 5 from the μ-opiate positives.

Figure 5.25 Virtual screening of a 15,625-membered tripeptoid virtual library using SA-based selection: final outcome leading to nonpeptoidic leads.

library of 12³ = 1728 compounds would thus have been sufficient to cover the significant activity/similarity profile embedded in the whole virtual library (25³ = 15,625 virtual components). Another report highlighting the potential of simulated annealing guided evaluation (SAGE) in library design has recently been published (110).
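The selection loop described above (random start, similarity measurement, random replacement of one or two monomers, repetition over many cycles) can be illustrated with a minimal simulated annealing sketch. Everything specific in it is an assumption made for illustration: the integer encoding of the 25 building blocks, the hypothetical target trimer `LEAD`, the toy position-matching `similarity` score, and the cooling schedule. The original work used a real molecular similarity measure against leads such as 5.3–5.5, which is not reproduced here.

```python
import math
import random
from collections import Counter

N_MONOMERS = 25        # building blocks per position, as in the text
LEAD = (3, 17, 8)      # hypothetical target trimer standing in for a lead structure


def similarity(trimer, lead=LEAD):
    """Toy similarity: fraction of positions that match the lead (0.0 to 1.0)."""
    return sum(a == b for a, b in zip(trimer, lead)) / len(lead)


def anneal(rng, steps=500, t_start=1.0, t_end=0.01):
    """One SA run over the 25**3 virtual trimers; returns the best trimer seen."""
    current = tuple(rng.randrange(N_MONOMERS) for _ in range(3))
    best = current
    for i in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (i / (steps - 1))
        # Randomly replace one or two monomers of the current structure.
        candidate = list(current)
        for pos in rng.sample(range(3), rng.choice((1, 2))):
            candidate[pos] = rng.randrange(N_MONOMERS)
        candidate = tuple(candidate)
        delta = similarity(candidate) - similarity(current)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
        if similarity(current) > similarity(best):
            best = current
    return best


def block_occurrence(n_runs=40, seed=0):
    """Count how often each building block appears in the best trimer of each run."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        counts.update(anneal(rng))
    return counts


occurrence = block_occurrence()
```

Building blocks with a high count in `occurrence` play the role of the recurrent fragments in Fig. 5.24; restricting each position to those blocks is what shrinks the library from 25³ to 12³ members.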

Two other examples of rational design of focused libraries are related to the exploitation of target information, and use either Genetic Optimization for Ligand Docking (GOLD) (111–113) to screen virtual combinatorial libraries for their docking with a partially flexible target using a genetic algorithm; or the ligand design program LUDI (114, 115) to map the active site of a target and to extract the structural information needed to build a target-focused small library.

Jones et al. (113) selected as target a lipase from Candida rugosa with known active site coordinates (116) and built a 44,730-member virtual library of amides L1 from commercially available acids (M1, 426 candidates) and amines (M2, 105 candidates) (Fig. 5.26). The selection was performed on the reagents, as the library size would have prevented the docking of each reaction product; each monomer, though, was “adjusted” to resemble the final amides by converting acids M1 to N-methylamides M1R and amines M2 to acetylamino compounds M2R (a, Fig. 5.26). Each adjusted monomer was docked into the lipase active site using a genetic algorithm (GA) to select the ideal GOLD fit; three GA runs of 50,000 operations each were performed per monomer to select three different fits, unless the first two results were already extremely similar (b, Fig. 5.26). A total of 1076 GOLD fits for the acids and 270 fits for the amines were identified. The fits were then merged to build the more meaningful amides from the virtual library: merging was done by deleting the overlapping parts of M1R and M2R and joining the remaining fragments, provided that they were in the correct orientation to be linked and that the resulting amides did not clash with the active site structure. Only 311 combinations

[Scheme for Figure 5.26: selection workflow from the virtual amide library L1 to the selected library L2.]

M1 (acids, 426 candidates) and M2 (amines, 105 candidates); L1: 44,730-member virtual library of amides.
a: M1R (N-methylamides, 426 candidates) and M2R (acetylamino compounds, 105 candidates).
b: M1R GOLD fits: 1076; M2R GOLD fits: 270 (304,640 virtual combinations of GOLD fits).
c: 311 non-clashing M1/M2 fit superimpositions.
d: 237 energy-reasonable M1/M2 fit superimpositions.
e: 129 unique M1/M2 fit superimpositions (M1: 34 selected; M2: 49 selected); L2: 129-member selected library of amides.

a: modification of monomer structures; b: selection of three GOLD fits per monomer by GA (50,000 operations per modified monomer); c: merging of fits and elimination of clashing superimpositions; d: energy filtering (<500 kcal/mol); e: redundancy filtering (only unique structures kept).

Figure 5.26 Virtual screening of a combinatorial amide library L1 targeted towards Candida rugosa lipase using Genetic Optimization for Ligand Docking (GOLD).

out of the theoretical 304,640 possible superimpositions of M1R and M2R survived this step (c, Fig. 5.26). Among them, only those with reasonable energies (cut-off of 500 kcal/mol; d, Fig. 5.26) that represented unique compounds (e, Fig. 5.26) were retained, eventually affording a 129-member library L2 from 34 acids M1 and 49 amines M2. The process required 159 hours of CPU time and allowed both a >30-fold reduction in the actual versus virtual library size and a significant reduction in the


number of actual versus virtual monomers (>10-fold for acids, >2-fold for amines); confirmation of the virtual screening results via resynthesis and testing of the selected individuals would be crucial to validate the process. The authors cross-checked the results computationally by docking the 129 selected compounds into the lipase active site using a more rigorous GA-selection procedure to find the preferred docking conformations (10 GA runs per compound, 100,000 operations per run); the vast majority (>85%) of compounds gave a good correlation between reagent-based and product-based binding mode.
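The size reductions quoted for the GOLD-based selection follow directly from the counts given in the text and in Fig. 5.26. The short calculation below reproduces them; the variable and function names are illustrative only.

```python
# Counts taken from the text and Fig. 5.26.
acids, amines = 426, 105
selected_acids, selected_amines = 34, 49
selected_library = 129

# Every acid can in principle be coupled with every amine.
virtual_library = acids * amines


def fold_reduction(before, after):
    """How many times smaller the selected set is than the starting set."""
    return before / after


library_fold = fold_reduction(virtual_library, selected_library)
acid_fold = fold_reduction(acids, selected_acids)
amine_fold = fold_reduction(amines, selected_amines)

# Filtering funnel for the merged fits (steps c, d, e).
funnel = [("non-clashing", 311), ("energy-reasonable", 237), ("unique", 129)]
survival = {label: count / 311 for label, count in funnel}
```

The virtual library size comes out at 44,730, and the three fold reductions are consistent with the ">10-fold for acids" and ">2-fold for amines" figures quoted above.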

Lew and Chamberlin (115) selected the human T cell Kv1.3 potassium channel as a target for novel immunosuppressive agents (117). Using LUDI (114, 115), the target active site was screened by docking into it over 1,000 random structural fragments (e.g., amines, acids, phenyl rings); the active site residues essential for specific Kv1.3 blockade and the optimal location of hydrophobic, hydrophilic, and hydrogen bond donor/acceptor groups in the site were determined (see ref. 115 for more details). The specificity of the resulting model was successfully validated by docking known selective Kv1.3 blockers (complete match with the LUDI model) and nonspecific blockers (partial mismatch). The identified fragments were then connected, respecting their orientation and distances; this resulted in the selection of a phenyl stilbene scaffold (5.9, Fig. 5.27) as a compromise between ideal docking, restricted conformational freedom, and reasonable binding energies. The general fit of the scaffold to the target active site was exploited by combinatorial modification of three randomization points (R1–R3), and a synthetic scheme for a scaffold 5.9-derived library employing commercially available reagents and monomers was designed and assessed (Fig. 5.27). A 400-member discrete library L3 was prepared (Fig. 5.28) and screened, revealing several weakly active channel blockers (e.g., 5.10–5.13, Fig. 5.28); a preliminary but assessed SAR for the phenyl stilbene scaffold, which could be used for further optimization, was also obtained.

Computational design/selection tools for focused libraries can significantly reduce the efforts required to optimize preliminary structural information while ensuring the exploitation of the same structural input. The use of these results to refine the structural requirements and to eventually generate better focused libraries will become a key tool for combinatorial scientists working with synthetic organic libraries for different applications.

5.4.3 Biased-Targeted Libraries

Concurrent to the increasingly popular use of diverse libraries to sample chemical space and to provide chemical leads for various applications, the search for areas inside chemical space that are more promising for specific applications has also gained relevance (118–120). We will review the example of pharmaceutical research, where the so-called druglike properties of a molecule are the subject of frequent reports by various groups.

It has been observed that most of the currently available drugs are small organic molecules, and their molecular weight is typically between 250 and 600 daltons. With this constraint introduced for library design, and knowing that at least two monomer

[Scheme for Figure 5.27: phenyl stilbene scaffold 5.9 with randomization points R1–R3, and its SP route from monomer sets M1–M3 on a safety-catch linker.]

SC = safety-catch linker; M1 = aromatic aldehydes; M2 = aromatic boronic acids; M3 = nucleophiles.

Figure 5.27 Design of an SP route to a focused SP library of phenyl stilbenes selected with the program LUDI for the Kv1.3 potassium channel.

sets are to be used for a library synthesis, each monomer must weigh 250 daltons or even less, thus strongly limiting the number of candidate monomers for each chemical class. The lipophilicity of organic molecules is also extremely important in determining their usefulness as drugs. Typically, this property, expressed as the logarithm of the partition coefficient of a molecule between n-octanol and water (log P), is ideal for drug discovery when its value lies between 2 and 4–5, which handicaps both extremely hydrophilic and extremely lipophilic building blocks. Another popular filter is the number of rotatable bonds contained in a library individual: when compounds are extremely flexible, their binding capacity, which is related to a specific conformation, becomes weaker, so that significant libraries should always contain a reasonable degree of rigidity (possibly embedded into the scaffold, but also contributed by the monomers) to act effectively as a source of relevant hits. Computational chemistry may help both in evaluating the degree of flexibility of molecules and subsets and in selecting compounds having predefined profiles in terms of degrees of freedom. These and other properties can

[Scheme for Figure 5.28: L3, a 400-member SP discrete library of phenyl stilbenes with randomization points R1–R3.]

R1 include unsubstituted; 2-F; 4-F; 3-F,4-Cl; 3,4-diOBn; 3-Cl; 4-OBn.
R2 include unsubstituted; 4-Cl; 4-OMe; 4-F.
R3 include Me; Et; n-Pr; i-Pr; n-Bu; cyclohexyl; guanidyl; morpholyl.

Selected channel blockers: 5.10–5.13, with IC50 values of 2.9, 3.7, 3.9, and 8.0 µM.

Figure 5.28 Structure of the phenyl stilbene focused library L3 and of several positives from screening (5.10–5.13).

easily be predicted by software programs, as shown in several papers (75, 121–123), to aid selection among the virtual sets comprised in the druglike chemical space.
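As a concrete illustration of such property-based filtering, the sketch below applies the molecular weight and log P windows mentioned above, plus a rotatable bond cap, to a few made-up compounds. The compound names and property values are invented, and the cap of 10 rotatable bonds is an assumed threshold chosen only for the example; the text gives no specific number. In practice the properties would come from a prediction program rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float      # daltons
    log_p: float           # n-octanol/water partition coefficient (log scale)
    rotatable_bonds: int


MAX_ROTATABLE_BONDS = 10   # hypothetical cut-off, not taken from the text


def is_druglike(compound, max_rotatable=MAX_ROTATABLE_BONDS):
    """Apply the three filters discussed in the text to one compound."""
    return (
        250 <= compound.mol_weight <= 600
        and 2 <= compound.log_p <= 5
        and compound.rotatable_bonds <= max_rotatable
    )


# Invented examples: one passes, the others each fail a different filter.
virtual_set = [
    Compound("cpd-1", 342.4, 3.1, 5),    # passes all filters
    Compound("cpd-2", 712.9, 4.0, 6),    # too heavy
    Compound("cpd-3", 298.3, 0.4, 3),    # too hydrophilic
    Compound("cpd-4", 455.6, 4.8, 14),   # too flexible
]

druglike = [c.name for c in virtual_set if is_druglike(c)]
```

Keeping each filter as a separate condition makes it easy to tighten or relax a single window when the target application calls for a different property profile.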

A recent report presented the so-called MultiLevel Chemical Compatibility (MLCC) to determine the similarity of any structure to known drug-compatible compounds from representative databases (124); although by definition incomplete, as all the “drug-like” space is far from being determined, this and similar tools can


increase the confidence and the probability of success in progressing a class of drug-like compounds.

Another important class of targeted libraries is aimed at specific families of targets. Examples are kinases (125, 126), proteases (127–129), and G-protein coupled receptors (130, 131). These libraries are driven by generic structural information on the family of proteins, and the correct use of this information (creation of a loose pharmacophore, selection of monomers/products on the basis of the similarity fit into the pharmacophoric model) is similar to the process observed for focused library design.

We will describe here a specific example where Kick and co-workers (132, 33) designed and prepared a library of inhibitors of aspartyl proteases. The general structure of the library, as inspired by the transition state of the enzymatic reaction, and the three monomer sets used to prepare it (amines, R1, and carboxylic acids, R2 and R3) are reported in Fig. 5.29. The benzyl substituent originated from the structure of pepstatin, a known inhibitor of cathepsin D, the specific target for this work. The virtual monomer sets were filtered by selecting commercially available amines and acids with a maximum MW of 275 daltons. This led to a final list of around 700 amines (R1) and 1900 acylating agents (R2 and R3; sulfonyl chlorides and isocyanates were also included to give sulfonamides and ureas, respectively), and consequently to a virtual library exceeding 10⁹ compounds. The library design started by modeling the (S)-hydroxyethylene scaffold in the enzyme active site, as for the enzyme–pepstatin complex (a, Fig. 5.30); then a number of low-energy conformations were generated for the scaffold and clustered into four families (b, Fig. 5.30). For each of these families a thorough conformational search was performed for each substituent independently (c–e, Fig. 5.30), and the conformations with R1–R2 clashes were discarded (f, Fig. 5.30). The monomers exceeding $35 per gram were removed, and the 50 highest scoring components from all the conformational families were merged to create the corresponding virtual compounds (g, Fig. 5.30). The compounds were clustered (h, Fig. 5.30), and the 10 highest scoring monomers from different clusters were finally selected for each randomization point (i, Fig. 5.30) to give the biased-targeted 1000-member library L4. Another 1000-member diverse library L5 was generated from the same monomer sets using clustering to select the most diverse monomers for each position (j, Fig. 5.30). This library was used to measure the effectiveness of targeted design in discovering actives on cathepsin D. The results for both libraries are reported in Table 5.1. Library L4 was more successful both in terms of the number and the

 

 

 

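[Scheme for Figure 5.29: general library structure with substituents R1–R3 and its disconnection into an amine (R1–NH2) and two carboxylic acids (R2–COOH and R3–COOH).]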

Figure 5.29 Retrosynthesis of a biased targeted library aimed at aspartyl protease inhibitors.

 

 

 

 

 

[Scheme for Figure 5.30: design workflow from the modeled scaffold to the focused selection L4 (1000 discretes, R1–R3 each chosen from 10 monomers after reduction from 50) and the diversity selection L5 (1000 discretes).]

a: scaffold modeling in the enzyme active site; b: generation of 4 families of low-energy conformations; c: modeling of R1; d: modeling of R2; e: modeling of R3; f: removal of R1–R2 clashing conformations; g: price cut, monomer selection, and virtual library generation; h: clustering; i: second monomer selection and generation of L4; j: monomer selection by diversity and generation of L5.

Figure 5.30 Rational design of a biased-targeted library L4 and a diverse library L5 using chemical filters and clustering selection methods.
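The second monomer selection (steps h and i), in which the highest scoring monomers are taken from different clusters, can be sketched as a simple greedy pass over a score-sorted list. This is an illustrative reading of the procedure, not the published algorithm: the monomer names, scores, and cluster labels are invented, and the convention that a higher score means a better fit is an assumption.

```python
def pick_top_from_distinct_clusters(monomers, n):
    """Pick up to n monomers, best score first, with at most one per cluster.

    monomers: list of (name, score, cluster_id) tuples; higher score is
    assumed to mean a better fit.
    """
    chosen = []
    used_clusters = set()
    for name, _score, cluster in sorted(monomers, key=lambda m: m[1], reverse=True):
        if cluster in used_clusters:
            continue  # a better scoring member of this cluster was already taken
        chosen.append(name)
        used_clusters.add(cluster)
        if len(chosen) == n:
            break
    return chosen


# Invented candidates for one randomization point.
candidates = [
    ("amine-1", 9.0, 0),
    ("amine-2", 8.5, 0),   # same cluster as amine-1, so skipped
    ("amine-3", 7.0, 1),
    ("amine-4", 6.0, 2),
    ("amine-5", 5.0, 1),
]

selected = pick_top_from_distinct_clusters(candidates, n=3)

# Ten monomers at each of three positions give the 1000-member library.
library_size = 10 ** 3
```

Applying the same function to each of R1, R2, and R3 with n = 10 would give a 10 × 10 × 10 array of the kind used for L4.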

TABLE 5.1 Inhibition of Cathepsin D by Discrete Libraries L4–L6: Number of Positives at Different Inhibitor Concentrations

Concentration(a)    L4(b)    L5(b)    L6(c)
100 nM              7        1        7
330 nM              23       3        NT
1 µM                67       26       36

(a) Concentration of the inhibitor. (b) 1000 compounds tested. (c) 39 compounds tested.


potency of hits: resynthesis of its hits as pure compounds produced 5.14 with a Ki = 73 ± 9 nM and several other inhibitors with Ki values around 100–200 nM. Library L5 produced 5.15 as its best inhibitor with a Ki = 356 ± 31 nM and other micromolar activities (Fig. 5.31). These results were then optimized by making a focused library L6 (39 discretes) using the monomers clustered together with the ones producing the most active compounds in L4. Several low nanomolar inhibitors such as 5.16 (Ki = 14 ± 2 nM) and 5.17 (Ki = 18 ± 2 nM) were obtained (Fig. 5.31), and a large percentage of compounds showed significant enzyme inhibition (Table 5.1).
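The comparison between the three libraries is easier to see as hit rates than as raw counts, because L6 contains only 39 compounds. The calculation below uses the numbers from Table 5.1; the dictionary layout and function name are illustrative, and the "NT" entry is represented as `None` on the assumption that it marks a concentration that was not tested.

```python
# Positives per library at each inhibitor concentration (nM), from Table 5.1.
# None stands for the "NT" entry (assumed here to mean "not tested").
tested = {"L4": 1000, "L5": 1000, "L6": 39}
positives = {
    "L4": {100: 7, 330: 23, 1000: 67},
    "L5": {100: 1, 330: 3, 1000: 26},
    "L6": {100: 7, 330: None, 1000: 36},
}


def hit_rate(library, concentration_nm):
    """Percentage of tested compounds that were positive, or None if untested."""
    count = positives[library][concentration_nm]
    if count is None:
        return None
    return 100 * count / tested[library]


rates_at_1_um = {lib: round(hit_rate(lib, 1000), 1) for lib in tested}
rates_at_100_nm = {lib: round(hit_rate(lib, 100), 1) for lib in tested}
```

At 1 µM the targeted library L4 gives roughly two and a half times the hit rate of the diverse library L5, while more than nine out of ten members of the focused library L6 are positive.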

 

 

 

[Structures for Figure 5.31: 5.14 (Ki = 73 ± 9 nM), 5.15 (Ki = 356 ± 31 nM), 5.16 (Ki = 14 ± 2 nM), and 5.17 (Ki = 18 ± 2 nM).]

 

 

Figure 5.31 Structure of cathepsin D inhibitors 5.14–5.17 from libraries L4, L5, and L6.