Cundari Th.R., Computational Organometallic Chemistry, ISBN 0824704789

Recipe for an Organometallic Force Field

Norrby

for the most important task of the electrostatic function is to reproduce intermolecular interactions.

The electrostatic parameters, together with other nonbonded parameters, should generally be set according to the specific force field rules ahead of actual refinement, and then kept fixed, at least in the initial refinement stages. However, when the new parameter set has stabilized with respect to the data, it is advantageous to allow a slight variation in the nonbonded parameters to improve the overall performance.

2.5. van der Waals Data

The types of data just listed will generally suffice for isolated molecules, if no new element types are introduced. However, applications to condensed phases, or sets incorporating unusual elements, may require additional data to fit vdW parameters. If QM data are to be used (39), very high levels of theory are required, because HF and most DFT methods do not incorporate London dispersion. Most correlated methods require huge basis sets if the vdW interactions are to be distinguishable from basis set deficiency errors. Experimental sources of data for nonbonded parameters include crystal cell constants and heats of sublimation. In all cases, the balance between vdW and electrostatic parameters is very important. However, if the metal atom is buried deep enough in coordinating ligands, direct attractive interactions may be unimportant. The repulsive vdW component may sometimes be determined from obvious strain in bulky ligands (30), but only if the force constants in the deformed moieties are known from other sources.

2.6. Quantum Mechanical Data

Quantum mechanical data can be very efficiently included in parameterizations, because no data conversion is necessary, and properties can be calculated for any point on the PES. However, it is important to realize that the goal of most force fields is to reproduce experiments, not QM results. The chosen QM level puts a limit on the attainable accuracy of the force field. Most systematic errors in the QM method will be reproduced by the force field. In particular for metal systems, it is necessary to use correlated levels, with reasonably flexible basis sets. Some DFT-based methods have proven to give excellent cost/performance ratios. Suitable theoretical levels are discussed more thoroughly by Diedenhofen et al. in Chapter 4. In all the examples given in later sections, the QM data have been obtained at the B3LYP (40) level, using an ECP (41) for the metal and at least valence double-ζ quality basis sets for all atoms.

2.7. Transition-State Data for Q2MM

In the Q2MM method, force fields describing transition states as minima are developed from QM data. Structures, charges, and relative energies of stationary points can be used as is in the parameterization. However, QM-derived Hessians must be modified, because they implicitly define the curvature to be negative in the direction of the reaction coordinate. The modification involves determination of the normal modes, replacement of the negative eigenvalue with a positive value, and formation of a new Hessian from the modified eigensystem (18). After modification, the curvature is positive in all directions, thus fulfilling all criteria for a regular force field. From this point on, the derivation and use of the force field is analogous to a regular ground-state force field, except that calculated steric energies will now correspond to relative activation energies for the reaction under investigation (19,20).
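The eigenvalue-replacement step just described can be sketched in a few lines of NumPy. The toy matrix and the replacement eigenvalue below are illustrative choices, not values from any actual Q2MM implementation.

```python
import numpy as np

def q2mm_hessian(hess_qm, new_eig=2.0):
    """Replace negative eigenvalues of a transition-state Hessian with a
    positive value, then re-form the Hessian from the modified eigensystem."""
    eigval, eigvec = np.linalg.eigh(hess_qm)   # normal-mode determination
    eigval[eigval < 0.0] = new_eig             # remove the TS curvature
    return eigvec @ np.diag(eigval) @ eigvec.T

# toy symmetric "Hessian" with eigenvalues 4 and -2
h = np.array([[1.0, 3.0],
              [3.0, 1.0]])
h_mod = q2mm_hessian(h, new_eig=2.0)           # eigenvalues become 4 and 2
```

After the modification the matrix is positive definite, so a minimizer treats the former transition state as a regular minimum, as the text requires.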

3. DERIVING PARAMETERS

With all the necessary ingredients in place, the task is now to derive a reliable force field. In an automated refinement, the first step is to define in machine-readable form what constitutes a good force field. Following that, the parameters are varied, randomly or systematically (15,42). For each new parameter set, the entire data set is recalculated to yield the quality of the new force field. The best force field so far is retained and used as the basis for new trial parameter sets. The task is a standard one in nonlinear numerical optimization; many efficient procedures exist for selection of the optimum search direction (43). Only one recipe will be covered here, a combination of Newton–Raphson and Simplex methods that has been successfully employed in several recent parameterization efforts (11,19,20,28,44).

Parameter refinement is in many ways reminiscent of geometry optimization. The same problems apply—finding a minimum, and preferably the global minimum, of a function of many variables. Progress is not as easily visualized with a parameter set as with a set of coordinates, but the main implementation difference comes from the fact that gradients are not easily available in parameterization. For data that are calculated for minima on the PES (e.g., conformational energies), analytic gradients of the data with respect to the parameters cannot be determined. Thus, optimization must rely on numerical differentiation, approximate analytical derivatives (45), or methods that don’t employ gradients. An alternative is to employ only reference data for which analytic gradients are available (32).

In simple geometry optimization, the result is sensitive to the starting geometry. A very distorted starting structure may lead to a strained high-energy optimum. A similar problem plagues automated parameter refinement. However, the problem is most serious in the initial phase of the refinement. Special techniques and frequent manual intervention may be needed until the force field has stabilized on track to the desired optimum.


FIGURE 5 Penalty function and derivatives with respect to parameters.

3.1. Defining the Goal

The goal of parameter refinement may be defined simply as minimizing the deviation of all calculated data points from the corresponding reference values. This is generally done in a least-squares sense, employing the penalty function depicted in Figure 5,* where yi and yi° are the calculated and reference data points, respectively, σi corrects for the quality of the reference data, ci corrects for different units of measure, and ti is the relative importance of reproducing a specific type of data. The latter three, being constants for each data point, are conveniently combined into a weighting factor wi (42). The weight factor must be set for each data point. At the very least, different types of data must be converted to a common unit of measure. If not, an error of 1° in an angle might have the same impact as an error of 1 Å in a bond length!

An intuitive method for defining data weights is simply to use the inverse of the acceptable error. This can be either the acceptable error for one type of data in the final force field or the expected average error in a group of input data. Say, for example, that it is sufficient that the final force field reproduces bond lengths to within 0.01 Å, angles to 0.5°, and torsions to 1°. Suitable weight factors would then be 100 Å⁻¹, 2 degree⁻¹, and 1 degree⁻¹, respectively. If a low-quality structure with bond-length errors around 0.02 Å is included in the refinement, bond lengths in that particular structure could be given a lower weight, 50 Å⁻¹. Weights for other types of data have been exemplified in the literature (11,15,19,20,28,44).
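This weighting scheme can be sketched as follows. The exact functional form of Figure 5 is not reproduced here, so a common least-squares convention is assumed, with the combined weight multiplying the residual inside the square:

```python
import numpy as np

def penalty(y_calc, y_ref, w):
    """chi^2 = sum_i (w_i * (y_i - y_i_ref))^2, with w_i in inverse units
    of data point i so that every term becomes dimensionless."""
    r = np.asarray(y_calc) - np.asarray(y_ref)
    return float(np.sum((np.asarray(w) * r) ** 2))

# weights as inverse acceptable errors: 0.01 A -> 100 / A, 0.5 deg -> 2 / deg
w_bond, w_angle = 1 / 0.01, 1 / 0.5
# a 0.01 A bond error and a 0.5 deg angle error then contribute equally (1 each)
chi2 = penalty([1.55, 109.971], [1.54, 109.471], [w_bond, w_angle])
```

With these weights, an error at exactly the acceptable limit contributes one unit to the penalty function regardless of the data type.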

The balance between different types of data may be modified by further adjustment of the weight factors. In schemes employing QM-calculated energy derivatives, an extreme number of data points can be obtained with little effort.

*In earlier literature, the term merit function has been used (cf. Ref. 15). But because an increased value corresponds to a worse force field, penalty function is more appropriate.


It is then recommended that the weight of such data be reduced to avoid swamping the remaining reference data by sheer numbers. On the other hand, electrostatic parameters have a strong influence on conformational energies and may therefore be unduly adjusted by the automatic procedure in lieu of other, more relevant parameters. It can therefore be prudent to increase the weight of true electrostatic data (such as QM charges), especially in the initial stages of the refinement.

3.2. Initial Parameter Estimates

Setting up the initial force field is still largely a manual task. In particular, when the reference set contains properties of energy minima, it is important that structures be reasonably accurate even before the parameter refinement is initiated. To achieve this, bond and angle ideal values and nonbonded parameters must be well estimated, whereas force constants, most torsional parameters, and cross-terms can be entrusted to the automated refinement. Initial values must also be set for these, but it may be sufficient to use ‘‘similar’’ values from the existing force field. It is usually best to err on the high side with force constants, to minimize deviations from the reference values, and on the low side with torsional parameters and cross-terms, to avoid introducing physically unrealistic distortions.

Electrostatic parameters can be set directly from QM-calculated charges. With some force fields, the charges are fixed to QM values at the outset and not refined further (34). Other nonbonded parameters (vdW constants) are not easily elucidated directly from any type of input data. However, parameters for most of the periodic table are available in the literature (10,46). Different force fields do not use the same absolute parameter values, but the scales usually correlate. Thus, it is possible to fit the existing parameters in the force field of interest to any complete set and to obtain the missing parameters from the correlation.
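The correlation trick for missing vdW parameters amounts to a simple linear fit; all radii below are hypothetical numbers, chosen only to make the procedure concrete.

```python
import numpy as np

# vdW radii for elements present both in the target force field and in a
# complete literature set (hypothetical values)
lit = np.array([1.50, 1.70, 1.55, 1.80])   # literature parameters
ff = np.array([1.42, 1.63, 1.47, 1.74])    # existing target force field values
slope, intercept = np.polyfit(lit, ff, 1)  # the two scales usually correlate

# estimate a missing force-field radius for a metal found only in the
# literature set (hypothetical literature value 2.10)
r_metal = slope * 2.10 + intercept
```

The fitted line maps the literature scale onto the force-field scale, so any element covered by the complete set can be assigned a provisional parameter.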

Initial ideal bond lengths and angles can be obtained from averages of observed values in the reference data. However, if strained structures are included in the reference set, an improved procedure is available. After all the parameters have been given initial values, calculated bond lengths of one type can be correlated with the corresponding reference values. It may be postulated that for observed structures with small distortions, the ‘‘real’’ bond energy will follow the Hooke’s law expression in Figure 1 reasonably well. It can also be assumed that for small parameter changes, the force from the surrounding structure acting upon one bond is constant. For one bond, the calculated force should thus equal the real force (Fig. 6). If the observed bond length lobs is plotted against the calculated deviation from the estimated ideal length (lcalc − l0,est), a better estimate of the ideal bond length is obtained as the intercept, and an improved force constant can be obtained by dividing the initial estimate by the regression slope. Note that the ideal length will always be improved by this procedure, but the force constant may not; the new value should be accepted only if it is physically reasonable and improves the fit. The equations in Figure 6 can easily be extended to more complicated bond functions, but higher terms may be ignored if the deviations are small. The procedure can be applied iteratively, but not too far, because all data except the observed bond lengths are ignored. Final parameters should be obtained by optimizing the full penalty function (Fig. 5).

FIGURE 6 Regression of real and calculated bond lengths.
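The regression of Figure 6 is easy to carry out in practice. The data below are hypothetical but follow the prescription exactly: the intercept gives the improved ideal length, and the initial force constant divided by the slope gives the improved force constant.

```python
import numpy as np

# observed bond lengths and, from the initial force field, the calculated
# deviations from the estimated ideal length (hypothetical values, in A)
l_obs = np.array([1.530, 1.545, 1.560, 1.575])
dev_calc = np.array([0.000, 0.010, 0.020, 0.030])   # l_calc - l0_est
k_init = 600.0                                      # initial force constant

slope, intercept = np.polyfit(dev_calc, l_obs, 1)
l0_new = intercept         # improved ideal bond length
k_new = k_init / slope     # accept only if physically reasonable
```

Here a slope above one signals that the initial force constant was too stiff relative to the surroundings, so the corrected constant comes out lower.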

The most important assumption in Figure 6 is that the forces in the surrounding structure will be unaffected by parameter changes for the bond under observation. This is true only if the same parameters are not used in proximal bonds. For example, the assumption may break down if two bonds of the same type are present in a small ring. For this reason, the procedure is also less useful for angles, which are frequently redundant and linearly dependent (vide infra).

As a simple example, the procedure would fail completely for the H–C–H angle in methane, where the calculated bond angle will be 109.471° for all reasonable (and many unreasonable) ideal bond angles. In situations like this, the ideal bond angle should simply be set to the observed average.

Exact torsional parameters are important for conformational energies but frequently not for gross structural agreement. If the V2 term for conjugated bonds is set to any large value, the remaining torsional parameters can usually be zeroed or set to the values of ‘‘similar’’ torsions in the initial force field. However, this rule has many exceptions. When torsional parameters are important, reference data for the entire range of the rotational profile should be included. To avoid mixing with other force field terms, it is favorable to fit torsional parameters to QM data for rigid scans (47); rigid scans are also substantially cheaper to calculate than relaxed scans.

3.3. Refining Parameters

With the initial parameter set available, all data points can be calculated and compared to the reference data with suitable weighting (Fig. 5). The problem is then simply to vary the parameters in such a way that the penalty function decreases to a minimum. This is a very common task in all types of model development, and many numerical procedures are available (43). Here, we will focus on a joint application of two complementary techniques, Simplex and Newton–Raphson optimizations (15).

Simplex

The simplex optimization is a very simple and robust technique for optimizing any function of a moderate number of variables. Only the function values for different variable sets are needed. In this case, the function to be optimized is the penalty function, and the variables to vary are the force field parameters. To initialize a simplex optimization of N parameters, one must first select N + 1 linearly independent trial sets. A very simple way to achieve this is to start with the initial parameter estimate and then to vary each parameter in turn by a small amount, yielding N new trial sets. This is illustrated for a two-parameter case in Fig. 7. With two parameters, the shape of the simplex is a triangle, with three parameters a tetrahedron, and so on.

The penalty function is evaluated for all sets. The worst point is then selected and reflected through the centroid of the remaining points, yielding a new simplex. If the new point yields an even better result than the previous best point, an expansion is attempted. The expansion is accepted only if the result is better than for the simple reflection. If, on the other hand, the new point would be the worst in the new simplex, one of two possible contraction points is selected instead. Note that the contraction must always be accepted; if not, the simplex will just oscillate between two bad points.

FIGURE 7 Simplex optimization of two parameters.


A simplex optimization is only run for a specified number of cycles, typically 10N to 30N. If the same point remains the best for 3N cycles, the optimization is also terminated. The method is very robust, and boundary conditions for parameters are easily implemented (for example, ideal bond lengths should always be positive). However, convergence is slow when too many parameters are included. As a rule of thumb, no more than 10 parameters should be included in a standard simplex optimization, but a recently introduced biasing procedure, in which the inversion point is offset toward the best points, can make the method competitive for up to 30–40 parameters (15).
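The reflect/expand/contract logic can be sketched as below. This is a toy version, not the biased variant of Ref. 15: it implements only the inside contraction (of the two possible contraction points) and a fixed cycle count.

```python
import numpy as np

def simplex_minimize(f, p0, step=0.1, n_cycles=400):
    """Minimal simplex sketch for a penalty function f.  The starting simplex
    is built as in the text: the initial guess plus one point per parameter,
    each shifted by `step`."""
    n = len(p0)
    pts = [np.array(p0, float)]
    for j in range(n):                          # N extra trial sets -> N+1 points
        q = np.array(p0, float)
        q[j] += step
        pts.append(q)
    vals = [f(p) for p in pts]
    for _ in range(n_cycles):
        order = np.argsort(vals)
        pts = [pts[i] for i in order]
        vals = [vals[i] for i in order]
        centroid = np.mean(pts[:-1], axis=0)    # centroid without worst point
        refl = centroid + (centroid - pts[-1])  # reflect worst through centroid
        f_r = f(refl)
        if f_r < vals[0]:                       # better than best: try expansion
            exp = centroid + 2.0 * (centroid - pts[-1])
            f_e = f(exp)
            pts[-1], vals[-1] = (exp, f_e) if f_e < f_r else (refl, f_r)
        elif f_r < vals[-2]:                    # ordinary reflection accepted
            pts[-1], vals[-1] = refl, f_r
        else:                                   # contraction: always accepted
            con = centroid + 0.5 * (pts[-1] - centroid)
            pts[-1], vals[-1] = con, f(con)
    return pts[int(np.argmin(vals))]

# two-parameter quadratic "penalty" with minimum at (1, -2)
best = simplex_minimize(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
                        [0.0, 0.0])
```

On a smooth two-parameter penalty the routine homes in on the minimum within a few hundred cycles; note that the contraction is accepted unconditionally, exactly to avoid the oscillation between two bad points mentioned in the text.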

Newton–Raphson

The Newton–Raphson method is a very efficient method for finding roots of well-behaved functions (43). A step for one variable is obtained by dividing the function value by the gradient. Finding minima is equivalent to finding points where the first derivative of the function with respect to the variable is zero (the root of the gradient). The distance to this point from the initial variable value can then be estimated by dividing the first derivative by the second. However, convergence can become problematic when the second derivative goes to zero or becomes negative.

The multidimensional version of the Newton–Raphson minimization is employed for functions of many variables. The matrix of second derivatives is inverted and multiplied by the first derivatives to obtain the optimum step for all parameters (Fig. 8). Again, convergence can be problematic if the curvature is negative or close to zero. Note that the matrix is positive definite if the approximate form of the second derivative is used (last equation, Fig. 5). Thus, only eigenvalues close to zero can give problems. However, parameters are frequently interdependent, and therefore singularities are common.* The effect of a low eigenvalue is a very long step, possibly outside the region where the quadratic approximation is valid. A very simple fix is to introduce Lagrange multipliers, in effect increasing the curvature by adding a constant to each diagonal element (15). Another method is to follow only search directions with a strong curvature, employing singular value decomposition (SVD) (15). Each of these methods has the undesired effect that parameters with a weak curvature are not optimized.

FIGURE 8 Multidimensional Newton–Raphson.
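Both remedies can be expressed compactly with NumPy; the matrices below are toy examples with one stiff and one nearly singular direction.

```python
import numpy as np

def damped_nr_step(grad, hess, gamma=0.0):
    """Newton-Raphson step -H^-1 g, with a constant gamma added to each
    diagonal element to stiffen near-singular directions (the 'Lagrange
    multiplier' device of the text)."""
    return -np.linalg.solve(hess + gamma * np.eye(len(grad)), grad)

def svd_nr_step(grad, hess, thresh=1e-6):
    """Alternative: follow only search directions with strong curvature by
    discarding small singular values (SVD pseudo-inverse)."""
    return -np.linalg.pinv(hess, rcond=thresh) @ grad

# curvature matrix with one strong and one nearly singular direction
h = np.diag([100.0, 1e-8])
g = np.array([1.0, 1.0])

raw = damped_nr_step(g, h, gamma=0.0)     # undamped: huge step along soft mode
damped = damped_nr_step(g, h, gamma=1.0)  # damping caps the soft-mode step
svd = svd_nr_step(g, h)                   # SVD simply drops the soft mode
```

Both fixes tame the near-singular direction; as the text notes, the price is that the weakly curved parameter is barely (or not at all) optimized in that step.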

The Newton–Raphson method requires differentiation of all data points with respect to the parameters. For fixed-geometry properties (like energy derivatives), the force field derivatives can be obtained analytically (32). For other types of properties, an approximate analytical solution can be obtained by assuming that the shift in geometry is small upon parameter change (45). However, the most general and safest method is to obtain the derivatives numerically (15). The drawback is that the method is substantially slower than calculating analytical derivatives.
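A numerical scheme of this kind might look as follows, combining central differences with the approximate (positive semidefinite) second-derivative form mentioned above; the toy model and weights are illustrative.

```python
import numpy as np

def penalty_derivatives(y_func, p0, y_ref, w, dp):
    """Central-difference Jacobian of the data points, then gradient and
    approximate (Gauss-Newton) Hessian of chi^2 = sum (w*(y - y_ref))^2.
    Dropping the second-order residual term keeps the Hessian positive
    semidefinite, as noted in the text."""
    n = len(p0)
    jac = np.empty((len(y_ref), n))
    for j in range(n):
        dpj = np.zeros(n)
        dpj[j] = dp[j]
        jac[:, j] = (y_func(p0 + dpj) - y_func(p0 - dpj)) / (2.0 * dp[j])
    r = y_func(p0) - y_ref
    w2 = np.asarray(w) ** 2
    grad = 2.0 * jac.T @ (w2 * r)
    hess = 2.0 * jac.T @ (w2[:, None] * jac)
    return grad, hess

# toy linear "force field": two data points from two parameters
y = lambda p: np.array([p[0] + p[1], p[0] - p[1]])
grad, hess = penalty_derivatives(y, np.array([0.0, 0.0]),
                                 np.array([1.0, 1.0]),
                                 [1.0, 1.0], [1e-4, 1e-4])
```

For this linear model the central differences are exact, and one Newton–Raphson step -H⁻¹g lands directly on the optimum parameter set.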

Alternating Between Methods

The Newton–Raphson method shows good convergence for parameters that display a strong penalty function curvature and are not too strongly interdependent. However, there are usually some parameters that will not be well converged by the method. This problem has been alleviated in some recent parameterization efforts (20,44) by alternating between optimization methods. In numerical schemes, the absolute second derivative of the penalty function with respect to each parameter is available from direct differentiation. It is assumed that a parameter will be badly determined by a Newton–Raphson step if this value is very low or negative. The 10–20 worst parameters are selected and subjected to a separate simplex optimization.

A complete automated refinement cycle is detailed below. It is assumed that the penalty function can be determined for each parameter set by automatic calculation of all data points. Further required input is a list of parameters to be refined and a numerical differentiation step size for each parameter. The number of parameters to be refined is denoted N. The value for data point i calculated with parameter set k is shown as yi(pk), and the total penalty function for the same parameter set is χ²(pk). The initial parameter set is denoted p0, whereas a parameter set in which parameter j has been shifted for differentiation is shown as (p0 ± δpj).

1. Create 2N + 1 trial parameter sets for central differentiation, by subtracting and adding a numerical differentiation step δpj to each parameter in turn. Calculate all data points for all trial parameter sets.

*In the author’s experience, the curvature matrix will always become singular, at least at some points in the parameter refinement.


FIGURE 9 Numerical differentiation.

2. Calculate data point derivatives by central differentiation as shown in Figure 9. The numerical derivatives are then used to calculate the penalty function derivatives according to Figure 5 (substituting the numerical differences for dy/dp). Several new trial parameter sets are calculated from the last equation in Figure 8, using different Lagrange multipliers γ. Additional trial sets can be obtained from SVD solutions, by varying the threshold for acceptable singular values (15). Very small steps are discarded, whereas very long steps can be either discarded or reduced to an acceptable size (a trust radius). The penalty function is calculated for all trial parameter sets, the best is selected, and all others are discarded.

3. Using data from step 1, the maximum variation in the penalty function in response to each parameter is calculated from χ²(p0 + δpj), χ²(p0), and χ²(p0 − δpj). This value is used to balance the differentiation steps for the next iteration. As an example, using arbitrary limits: if the variation is less than 1, δpj is doubled, whereas if it is larger than 100, δpj is halved.

4. Calculate the absolute derivatives of the penalty function with respect to each parameter by numerical differentiation as shown in Figure 9. Any parameters that result in a negative second derivative, and as many as possible of those where the second derivative is small compared to the first (up to a maximum of 20–40 parameters), are selected for simplex optimization. The starting simplex is derived from the best parameter set (in step 2) by shifting each parameter in turn, using the updated step lengths from step 3.

5. The best parameter set after the simplex is compared to the initial parameter set. If the improvement is lower than 0.1%, the refinement cycle is terminated.
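The selection rule of step 4 can be sketched as below; the thresholds and the ranking by curvature-to-gradient ratio are illustrative choices, not the exact criterion of Refs. 20 and 44.

```python
import numpy as np

def select_for_simplex(chi2_minus, chi2_0, chi2_plus, dp, max_n=20):
    """From the three penalty values per parameter (step 1 data), estimate
    first and second derivatives and pick the parameters worst suited to a
    Newton-Raphson step: negative curvature first, then small curvature
    relative to the gradient."""
    d1 = (chi2_plus - chi2_minus) / (2.0 * dp)
    d2 = (chi2_plus - 2.0 * chi2_0 + chi2_minus) / dp ** 2
    bad = [int(j) for j in np.where(d2 <= 0.0)[0]]    # always selected
    ratio = np.full(len(d2), np.inf)
    pos = d2 > 0.0
    ratio[pos] = d2[pos] / np.maximum(np.abs(d1[pos]), 1e-12)
    rest = [int(j) for j in np.argsort(ratio) if pos[j]]
    return (bad + rest)[:max_n]

# three parameters: negative, weak, and strong curvature, respectively
sel = select_for_simplex(np.array([9.9, 10.0005, 9.0]), 10.0,
                         np.array([9.9, 10.001, 12.0]),
                         np.array([0.1, 0.1, 0.1]), max_n=2)
```

With room for only two parameters, the negative-curvature parameter is taken first, followed by the one with the weakest curvature relative to its gradient.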

3.4. Frequently Encountered Problems

The initially estimated force field will usually give very large errors for some data points. The automatic procedure will respond by large parameter changes, but not always in parameters that a chemist would consider natural. For example, energy second derivatives (the Hessian) are usually connected with force constants, but for any force field employing more complex bonding terms than the simple Hooke’s law expression in Figure 1, Hessian elements are also affected by ideal bond lengths. Thus, large errors in the Hessian, which would naturally be corrected only by modification of force constants, might in an automatic procedure result in distorted ideal bond lengths. Many other types of parameters, in particular electrostatic parameters and cross-terms, are sensitive to this type of ‘‘unnatural’’ correction.

Erroneous data may give strange effects in automated parameterization schemes. Since all deviations are squared, a single large error may totally dominate the refinement. For example, extreme bond-length shortenings of the type illustrated in Figure 4 are quite common in crystal structures, especially if the crystallographer has failed to take notice of cocrystallizing rotameric forms. Such errors must be identified and removed from the data set. Some low-quality data may have to be included in order to define all parameters, but should then be given low weight factors. It is also important that any errors be small and randomly distributed.

For metal complexes, specific problems may also arise from the coordination model. Angles around the metal are frequently soft, so geometries are easily distorted. Small parameter changes may lead to large distortions and sometimes to qualitatively wrong coordination geometries. In the initial stages, it may be safest to assign specific, relatively stiff angle interactions. Any scheme that dynamically updates the parameter values during the energy minimization is hazardous in a parameter refinement. If the chosen model uses no angle parameters, it may even be necessary to use weak restraints on the atomic coordinates to put a limit on the maximum error and to avoid falling into an erroneous geometry. If no precautions of this type can be taken, it is particularly important that each iteration start from one set of starting geometries, not the resulting geometries of the previous iteration.

Most parameterization problems arise because the parameters are not uniquely defined by the data. Molecular mechanics parameters are to some extent redundant and will therefore frequently show linear dependencies in the refinement. Ideal bond angles are good examples of this. Compare, for example, the simple molecules water and methane. The H–O–H ideal angle will be well defined if a water structure is included in the parameterization. Any change in the ideal bond angle will be immediately reflected in the calculated structure. For methane, on the other hand, any ideal bond angle larger than the standard tetrahedral angle of 109.471° will give a perfectly tetrahedral structure. Say, for example, that the ideal angle is set to the chemically unreasonable value of 130°. The structures will be strained, but strain does not cause any increase in the penalty function, for the sum of forces on all atoms will still be zero. Vibrations will be affected, but a lowering of the force constant will have the same effect as lowering the ideal angle. In a more realistic parameterization also including ethane,
