28 BASIS SET CUSTOMIZATION

N 0
S   4   1.00
      6.7871900000E+02   1.7250000000E-02
      1.0226600000E+02   1.2052000000E-01
      2.2906600000E+01   4.2594000000E-01
      6.1064900000E+00   5.4719000000E-01
S   2   1.00
      8.3954000000E-01   4.7481000000E-01
      2.5953000000E-01   6.4243000000E-01
S   1   1.00
      8.0200000000E-02   1.0000000000E+00
P   2   1.00
      2.3379500000E+00   3.3289000000E-01
      4.1543000000E-01   7.7381000000E-01
P   1   1.00
      7.3880000000E-02   1.0000000000E+00
D   1   1.00
      8.0000000000E-01   1.0000000000E+00
****

FIGURE 28.4 Final basis.

Note that the answers have been rounded to three significant digits. Since the even-tempered formula is only an approximation, this does not introduce any significant additional error.

Although the even-tempered scheme is fairly reasonable far from the nucleus, each function added lies slightly further from the energy-optimized value. Generally, at most two or three additional functions are added to a basis set. Beyond this point, it is more efficient to switch to a different, larger basis.
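The extrapolation of one more diffuse exponent can be sketched in a few lines. This is a minimal sketch that assumes the usual even-tempered form, in which successive exponents form a geometric series, so the next exponent is the smallest one multiplied by the ratio of the two smallest. The function names are illustrative and do not come from any program.

```python
import math

def next_even_tempered(exponents):
    """Extrapolate one more diffuse exponent from the two smallest ones.

    Assumes a geometric (even-tempered) progression:
    zeta_next = zeta_n * (zeta_n / zeta_{n-1}).
    """
    ordered = sorted(exponents, reverse=True)   # tightest function first
    ratio = ordered[-1] / ordered[-2]           # common ratio, less than 1
    return ordered[-1] * ratio

def round_sig(x, sig=3):
    """Round x to `sig` significant digits."""
    if x == 0:
        return 0.0
    return round(x, sig - int(math.floor(math.log10(abs(x)))) - 1)

# The two valence s exponents of the second S shell in Figure 28.4
s_valence = [0.83954, 0.25953]
new_s = round_sig(next_even_tempered(s_valence))
print(new_s)
```

Applied to the two valence s exponents, this gives 0.0802 after rounding to three significant digits, which matches the most diffuse s exponent listed in Figure 28.4.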

A different scheme must be used for determining polarization functions and very diffuse functions (Rydberg functions). It is reasonable to use functions from another basis set for the same element. Another option is to use functions that will depict the electron density distribution at the desired distance from the nucleus, as described above.

Polarization functions of higher angular momentum than the highest occupied orbitals are usually the most polarization that will benefit HF or DFT results. Higher-angular-momentum functions are important for very-high-accuracy configuration interaction and coupled-cluster calculations. As a rule of thumb, uncontracting a valence primitive lowers the variational energy by about as much as adding a set of polarization functions.

A new basis for the element can now be created by combining these techniques. The basis in Figure 28.4 was created from the contracted set illustrated in Figure 28.3. Additional even-tempered exponents have been added to both the s and p functions. A polarization function of d symmetry was obtained from the 6-31G(d) basis set. In a realistic scenario, a certain amount of trial-and-error work, based on obtaining low variational energies and stronger chemical bonds, would be involved in this process. This nitrogen example is somewhat artificial because there are many high-quality basis functions available for nitrogen that would be preferable to customizing a basis set.


The final step is to check the performance of the basis set. This can be done by first doing a single-atom calculation to check the energy and virial theorem value. The UHF calculation for this basis gave a virial theorem check of -1.9802, which is in reasonable agreement with the correct value of -2. The UHF atom energy is -54.10814 Hartrees for this example. This is not a very good total energy for nitrogen because the example started with a fairly small basis set. The 6-31G(d) basis gives a total energy of -54.38544 Hartrees for nitrogen. The basis in this example should probably not be extended any further, since doing so would give a disproportionately well-described valence region and a poorly described core.

The final test of the basis quality, particularly in the valence region, is the result of molecular calculations. This basis gave an N2 bond length of 1.1409 Å at the HF level of theory and 1.1870 Å at the CCSD level of theory, in only moderate agreement with the experimental value of 1.0975 Å. The larger 6-31G(d) basis set gives a bond length of 1.0783 Å at the HF level of theory. The experimental bond energy for N2 is 225.9 kcal/mol. The HF calculation with this example basis yields 89.9 kcal/mol, compared to the HF 6-31G(d) bond energy of 108.6 kcal/mol. At the CCSD level of theory, the sample basis gives a bond energy of 170.3 kcal/mol.
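To make the comparison easier to read, the numbers quoted above can be tabulated as signed bond-length errors and as fractions of the experimental bond energy. This is only a bookkeeping sketch; the numerical values are those given in the text, while the dictionary labels and function names are illustrative.

```python
# Values quoted in the text (bond lengths in angstroms, bond energies in kcal/mol)
R_EXPT = 1.0975
D_EXPT = 225.9

bond_lengths = {
    "HF / example basis": 1.1409,
    "CCSD / example basis": 1.1870,
    "HF / 6-31G(d)": 1.0783,
}
bond_energies = {
    "HF / example basis": 89.9,
    "HF / 6-31G(d)": 108.6,
    "CCSD / example basis": 170.3,
}

def length_errors(lengths, reference):
    """Signed error (computed minus experiment) for each method."""
    return {name: r - reference for name, r in lengths.items()}

def recovered_fraction(energies, reference):
    """Fraction of the experimental bond energy recovered by each method."""
    return {name: d / reference for name, d in energies.items()}

for name, err in length_errors(bond_lengths, R_EXPT).items():
    print(f"{name:22s} bond length error {err:+.4f} A")
for name, frac in recovered_fraction(bond_energies, D_EXPT).items():
    print(f"{name:22s} recovers {100 * frac:.1f}% of the bond energy")
```

With the example basis the HF bond is too long by about 0.04 Å, and HF recovers roughly 40% of the experimental bond energy, rising to about 75% at the CCSD level.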

28.5 BASIS SET SUPERPOSITION ERROR

Basis set superposition error (BSSE) is an energy lowering of a complex of two molecules with respect to the sum of the individual molecule energies. This results in obtaining van der Waals and hydrogen bond energies that are too large because the basis functions on one molecule act to describe the electron density of the other molecule. In the limit of an exact basis set, there would be no superposition error. The error is also small for minimal basis sets, which do not have functions diffuse enough to describe an adjacent atom. The largest errors occur when using moderate-size basis sets.

The procedure for correcting for BSSE is called a counterpoise correction. In this procedure, the complex of molecules is first computed. The individual molecule calculations are then performed using all the basis functions from the complex. For this purpose, many ab initio software programs contain a mechanism for defining basis functions that are centered at a location which is not on one of the nuclei. The interaction energy is expressed as the energy for the complex minus the individual molecule energies computed in this way. In equation form, this is given as

$$E_{\mathrm{interaction}} = E_{AB}(AB) - E_{AB}(A) - E_{AB}(B) \qquad (28.8)$$

where the subscripts denote the basis functions being used and the letters in parentheses denote the molecules included in each calculation.
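The bookkeeping in equation 28.8 can be sketched as follows. The energies below are invented purely for illustration and do not come from any calculation; the function and variable names are likewise hypothetical. The only physical constant used is the conversion 1 Hartree = 627.5095 kcal/mol.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095

def counterpoise_interaction(e_ab_complex, e_ab_a, e_ab_b):
    """Equation 28.8: all three energies are computed in the full basis AB."""
    return e_ab_complex - e_ab_a - e_ab_b

def uncorrected_interaction(e_ab_complex, e_a_a, e_b_b):
    """Naive interaction energy: each molecule computed in its own basis only."""
    return e_ab_complex - e_a_a - e_b_b

# Hypothetical energies in Hartrees, invented for this illustration
e_complex = -152.0700                       # complex AB in basis AB
e_a_own, e_b_own = -76.0300, -76.0300       # A in basis A, B in basis B
e_a_in_ab, e_b_in_ab = -76.0310, -76.0315   # A and B, each in the full basis AB

raw = uncorrected_interaction(e_complex, e_a_own, e_b_own)
corrected = counterpoise_interaction(e_complex, e_a_in_ab, e_b_in_ab)
bsse = corrected - raw   # positive when the raw binding energy was too strong

print(f"uncorrected: {raw * HARTREE_TO_KCAL_PER_MOL:.2f} kcal/mol")
print(f"corrected:   {corrected * HARTREE_TO_KCAL_PER_MOL:.2f} kcal/mol")
print(f"BSSE:        {bsse * HARTREE_TO_KCAL_PER_MOL:.2f} kcal/mol")
```

Because each molecule is variationally lowered by the extra basis functions of its partner, the corrected interaction energy in this sketch is less negative than the uncorrected one.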

Counterpoise correction should, in theory, be unnecessary for large basis sets. However, practical applications have shown that it yields a significant improvement in results even for very large basis sets. The use of a counterpoise correction is recommended for the accurate computation of molecular interaction energies by ab initio methods.


Computational Chemistry: A Practical Guide for Applying Techniques to Real-World Problems. David C. Young. Copyright © 2001 John Wiley & Sons, Inc.

ISBNs: 0-471-33368-9 (Hardback); 0-471-22065-5 (Electronic)

29 Force Field Customization

It is occasionally desirable to add new parameters to a molecular mechanics force field. This might mean adding an element that is not in the parameterization set or correctly describing a particular atom in a specific class of molecules.

29.1 POTENTIAL PITFALLS

It is tempting to take parameters from some other force field. However, unlike ab initio basis sets, this is not generally a viable method. Force fields are set up with different lists of energy terms. For example, one force field might use stretch, bend, and stretch-bend terms, whereas another uses stretch and bend terms only. Using the stretch and bend parameters from the first without the accompanying stretch-bend term would result in incorrectly describing both bond stretching and bending.
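The stretch-bend point can be illustrated with a small sketch. The functional form below (harmonic stretches and bend plus a bilinear stretch-bend cross term) is a generic assumption, and all parameter values are hypothetical rather than taken from any published force field.

```python
import math

def valence_energy(r1, r2, theta, p, include_stretch_bend=True):
    """Energy of one A-B-C angle: two harmonic stretches, one harmonic bend,
    and an optional stretch-bend cross term (an assumed, generic form)."""
    dr1, dr2 = r1 - p["r0"], r2 - p["r0"]
    dtheta = theta - p["theta0"]
    energy = 0.5 * p["k_r"] * (dr1 ** 2 + dr2 ** 2)
    energy += 0.5 * p["k_theta"] * dtheta ** 2
    if include_stretch_bend:
        energy += p["k_sb"] * (dr1 + dr2) * dtheta
    return energy

# Hypothetical parameters (kcal/mol, angstroms, radians)
params = {"r0": 1.53, "k_r": 600.0, "theta0": math.radians(109.5),
          "k_theta": 100.0, "k_sb": 20.0}

# Slightly stretched bonds combined with a compressed angle
geometry = (1.56, 1.56, math.radians(104.0))
with_term = valence_energy(*geometry, params)
without_term = valence_energy(*geometry, params, include_stretch_bend=False)
print(f"with stretch-bend:    {with_term:.4f} kcal/mol")
print(f"without stretch-bend: {without_term:.4f} kcal/mol")
```

The stretch and bend constants in such a force field were fitted with the cross term present, so evaluating them without it shifts the energy of every distorted geometry, and the shift depends on both the bond and the angle displacements.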

From one force field to the next, the balance of energy terms may be different. For example, one force field might use a strong van der Waals potential and no electrostatic interaction, while another force field uses a weaker van der Waals potential plus a charge term. Even when the same terms are present, different charge-assignment algorithms yield systematic differences in results, and the van der Waals term may be different to account for this.

When the same energy terms are used in two force fields, it may be acceptable to transfer bond-stretching and angle-bending terms. These are fairly stiff motions that do not change excessively. The force constants for these terms vary between force fields much more than the unstrained lengths and angles do.

Transferring torsional and nonbonded terms between force fields is much less reliable. These are lower-energy terms that are much more interdependent. It is quite common to find force fields with significantly different parameters for these contributions, even when the exact same equations are used.

Atoms with unusual hybridizations can be particularly difficult to include. Most organic force fields describe atoms with hybridizations whose bond angles are all equivalent (i.e., sp, sp², and sp³ hybridizations with bond angles of 180°, 120°, and 109.5°, respectively). In contrast to this, a square planar atom will have some bond angles of 90° and some angles of 180°. In this case, it may be necessary to define the bond and angle terms manually, modify the software, or hold the bond angles fixed in the calculation.


29.2 ORIGINAL PARAMETERIZATION

Understanding how the force field was originally parameterized will aid in knowing how to create new parameters consistent with that force field. The original parameterization of a force field is, in essence, a massive curve fit of many parameters from different compounds in order to obtain the lowest standard deviation between computed and experimental results for the entire set of molecules. In some simple cases, this is done by using the average of the values from the experimental results. More often, this is a very complex iterative process.

The first step in creating a force field is to decide which energy terms will be used. This determines, to some extent, the ability of the force field to predict various types of chemistry. This also determines how difficult the parameterization will be. For example, more information is needed to parameterize anharmonic bond-stretching terms than to parameterize harmonic terms.

The parameters in the original parameterization are adjusted in order to reproduce the correct results. These results are generally molecular geometries and energy differences. They may be obtained from various types of experimental results or ab initio calculations. The sources of these "correct" results can also be a source of error. Ab initio results are only correct to some degree of accuracy. Likewise, crystal structures are influenced by crystal-packing forces.

Many parameterizations are merely a massive fitting procedure to determine which parameters will best reproduce these results. This procedure may be done with automated software or through the work and understanding of the designers. Often, a combination of both gives the best results. In recent years, global search techniques, such as genetic algorithms, have been used. This is usually an iterative procedure as parameters are adjusted and results computed for the test set of molecules.
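The adjust-and-recompute loop can be sketched on a toy problem: fitting the force constant and unstrained length of a single harmonic bond to reference energies. A simple pattern search stands in here for the automated fitting software or global search techniques mentioned above; it is not the method of any particular force field, and the reference data are synthetic values generated from assumed parameters.

```python
def harmonic(r, k, r0):
    """Harmonic bond-stretch energy."""
    return 0.5 * k * (r - r0) ** 2

def sum_sq_error(params, data):
    """Sum of squared deviations between model and reference energies."""
    k, r0 = params
    return sum((harmonic(r, k, r0) - e_ref) ** 2 for r, e_ref in data)

def pattern_search(objective, start, steps, data, tol=1e-10, max_iter=10000):
    """Adjust one parameter at a time; halve the steps when no move helps."""
    params, steps = list(start), list(steps)
    best = objective(params, data)
    for _ in range(max_iter):
        improved = False
        for i in range(len(params)):
            for sign in (1.0, -1.0):
                trial = list(params)
                trial[i] += sign * steps[i]
                value = objective(trial, data)
                if value < best:
                    params, best, improved = trial, value, True
        if not improved:
            steps = [s / 2.0 for s in steps]
            if max(steps) < tol:
                break
    return params, best

# Synthetic reference energies generated from k = 640 and r0 = 1.10 (invented values)
reference = [(r, harmonic(r, 640.0, 1.10)) for r in (1.00, 1.05, 1.10, 1.15, 1.20)]
(k_fit, r0_fit), residual = pattern_search(
    sum_sq_error, start=[500.0, 1.20], steps=[50.0, 0.05], data=reference)
print(f"k = {k_fit:.3f}, r0 = {r0_fit:.5f}, residual = {residual:.2e}")
```

A real parameterization fits many coupled parameters against many molecules at once, which is why the process is far more laborious than this two-parameter example suggests.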

A second procedure is called a rule-based parameterization. This is a way of using some simple relationship to predict a large number of parameters. For example, bond lengths might be determined as the geometric mean of covalent bond radii multiplied by a correction factor. In this case, determining one correction factor is tantamount to determining all needed bond lengths. This procedure has the advantage of being able to create a force field describing a large variety of compounds. The disadvantage is that the accuracy of results for a specific compound is not usually as good as that obtained with a force field parameterized specifically for that class of compounds.
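The bond-length rule used as an example above can be written out directly. This is a minimal sketch: the covalent radii are approximate values assumed for illustration, and the correction factor is hypothetical rather than fitted to anything.

```python
import math

# Approximate covalent radii in angstroms, assumed here for illustration only
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def rule_based_bond_length(elem_a, elem_b, correction, radii=COVALENT_RADII):
    """Unstrained bond length from the geometric mean of two covalent radii,
    scaled by a single global correction factor (the only fitted quantity)."""
    return correction * math.sqrt(radii[elem_a] * radii[elem_b])

# Hypothetical factor; 2.0 reproduces r_A + r_B exactly when both radii are equal
correction = 2.0
for pair in [("C", "C"), ("C", "N"), ("C", "O"), ("C", "H")]:
    print(pair, round(rule_based_bond_length(*pair, correction), 3))
```

Changing the single value of `correction` changes every bond length at once, which is both the convenience and the limitation of a rule-based scheme.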

29.3 ADDING NEW PARAMETERS

A measure of sophistication is necessary in order to obtain a reasonable set of parameters. The following steps are recommended in order to address the concerns above. They are ranked approximately best to worst, but it is advisable to use all techniques for the sake of double-checking your work. Step 9 should always be included in the process. There are utilities available to help ease the amount of work involved, but even with these the researcher should still pay close attention to the steps being taken.

1. Examine the literature describing the original parameterization of the force field being used. Following this procedure as much as possible is advisable. This literature also gives insights into the strengths and limitations of a given force field.

2. Find articles describing how new parameters were added to the exact same force field. The procedure will probably be similar for your case.

3. If the atom being added has an unusual hybridization, examine the literature in which parameters were derived for that same hybridization.

4. If considering transferring parameters from one force field to another, examine the parameters for an atom that is in both force fields already. If the two sets of parameters are not fairly similar, do not use parameters from that force field.

5. First determine what parameters will be used for describing bond lengths and angles. Then determine torsional, inversion, and nonbonded interaction parameters.

6. Try using obvious values for the parameters, such as bond lengths directly from crystal structures. This assumes that no interdependence exists between parameters, but it is a starting point.

7. Use values from ab initio calculations.

8. Look for a very similar atom that has been parameterized for the force field and try scaling its parameters by a suitable correction factor. Even if one of the steps above was used, this provides a quick check on the reasonableness of your parameterization.

9. Run test calculations with the new parameters. Then adjust the parameters as necessary to reproduce experimental results before using them to describe an unknown compound.
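The comparison in step 4 can be sketched as a relative-difference check on an atom type present in both force fields. The parameter names and values are hypothetical, and the 10% tolerance is an arbitrary assumption, since the text does not quantify "fairly similar".

```python
def relative_differences(params_a, params_b):
    """Relative difference for every parameter name present in both sets."""
    diffs = {}
    for name in sorted(params_a.keys() & params_b.keys()):
        a, b = params_a[name], params_b[name]
        scale = max(abs(a), abs(b))
        diffs[name] = 0.0 if scale == 0 else abs(a - b) / scale
    return diffs

def looks_transferable(params_a, params_b, tolerance=0.10):
    """True only if every shared parameter agrees within the tolerance."""
    diffs = relative_differences(params_a, params_b)
    return bool(diffs) and all(d <= tolerance for d in diffs.values())

# Hypothetical parameters for an atom type that exists in both force fields
field_1 = {"bond_r0": 1.09, "bond_k": 340.0, "angle_theta0": 109.5, "vdw_epsilon": 0.020}
field_2 = {"bond_r0": 1.10, "bond_k": 331.0, "angle_theta0": 109.0, "vdw_epsilon": 0.046}

for name, diff in relative_differences(field_1, field_2).items():
    print(f"{name:14s} differs by {100 * diff:.1f}%")
print("transferable:", looks_transferable(field_1, field_2))
```

In this invented case the bonded terms agree closely while the van der Waals well depth does not, so the check fails. A numerical screen of this kind only supplements, and does not replace, reading how each force field was parameterized.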



30 Structure-Property Relationships

Structure-property relationships are qualitative or quantitative empirically defined relationships between molecular structure and observed properties. In some cases, this may seem to duplicate statistical mechanical or quantum mechanical results. However, structure-property relationships need not be based on any rigorous theoretical principles.

The simplest structure-property relationships are qualitative rules of thumb. For example, the statement that branched polymers are generally more biodegradable than straight-chain polymers is a qualitative structure-property relationship.

When structure-property relationships are mentioned in the current literature, a quantitative mathematical relationship is usually implied. Such relationships are most often derived by using curve-fitting software to find the linear combination of molecular properties that best predicts the property for a set of known compounds. This prediction equation can be used for either the interpolation or extrapolation of test set results. Interpolation is usually more accurate than extrapolation.

When the property being described is a physical property, such as the boiling point, this is referred to as a quantitative structure-property relationship (QSPR). When the property being described is a type of biological activity, such as drug activity, this is referred to as a quantitative structure-activity relationship (QSAR). Our discussion will first address QSPR. All the points covered in the QSPR section are also applicable to QSAR, which is discussed next.

30.1 QSPR

The first step in developing a QSPR equation is to compile a list of compounds for which the experimentally determined property is known. Ideally, this list should be very large. Often, thousands of compounds are used in a QSPR study. If there are fewer compounds on the list than parameters to be fitted in the equation, then the curve fit will fail. If the same number exists for both, then an exact fit will be obtained. This exact fit is misleading because it fits the equation to all the anomalies in the data; it does not necessarily reflect all the correct trends necessary for a predictive method. In order to ensure that the method will be predictive, there should ideally be 10 times as many test compounds as fitted parameters. The choice of compounds is also important. For example, if the equation is only fitted with hydrocarbon data, it will only be reliable for predicting hydrocarbon properties.

The next step is to obtain geometries for the molecules. Crystal structure geometries can be used; however, it is better to use theoretically optimized geometries. By using the theoretical geometries, any systematic errors in the computation will cancel out. Furthermore, the method will predict as yet unsynthesized compounds using theoretical geometries. Some of the simpler methods require connectivity only.

Molecular descriptors must then be computed. Any numerical value that describes the molecule could be used. Many descriptors are obtained from molecular mechanics or semiempirical calculations. Energies, population analysis, and vibrational frequency analysis with its associated thermodynamic quantities are often obtained this way. Ab initio results can be used reliably, but are often avoided due to the large amount of computation necessary. The largest percentage of descriptors are easily determined values, such as molecular weights, topological indexes, moments of inertia, and so on. Table 30.1 lists some of the descriptors that have been found to be useful in previous studies. These are discussed in more detail in the review articles listed in the bibliography.

Once the descriptors have been computed, it is necessary to decide which ones will be used. This is usually done by computing correlation coefficients. Correlation coefficients are a measure of how closely two values (descriptor and property) are related to one another by a linear relationship. If a descriptor has a correlation coefficient of 1, it describes the property exactly. A correlation coefficient of zero means the descriptor has no relevance. The descriptors with the largest correlation coefficients are used in the curve fit to create a property prediction equation. There is no rigorous way to determine how large a correlation coefficient is acceptable.
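The screening step can be sketched with the Pearson correlation coefficient, which is assumed here to be the coefficient meant. The descriptor and property values are invented for illustration. Descriptors are ranked by the magnitude of the coefficient, since a value near -1 indicates a linear relationship just as strong as one near +1.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented data for five compounds
prop = [10.0, 21.0, 29.5, 41.0, 50.5]
descriptors = {
    "n_carbons": [1, 2, 3, 4, 5],
    "dipole": [0.3, 0.1, 0.4, 0.2, 0.5],
    "ring_strain": [5.0, 4.1, 2.9, 2.2, 1.0],
}

ranked = sorted(descriptors, key=lambda name: abs(pearson_r(descriptors[name], prop)),
                reverse=True)
for name in ranked:
    print(f"{name:12s} r = {pearson_r(descriptors[name], prop):+.3f}")
```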

Intercorrelation coefficients are then computed. These tell when one descriptor is redundant with another. Using redundant descriptors increases the amount of fitting work to be done, does not improve the results, and results in unstable fitting calculations that can fail completely (due to dividing by zero or some other mathematical error). Usually, the descriptor with the lowest correlation coefficient is discarded from a pair of redundant descriptors.
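A minimal sketch of this pruning step follows. The 0.95 redundancy threshold is an assumption, as the text gives no cutoff, and the data are invented so that two of the three descriptors are nearly collinear.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def prune_redundant(descriptors, prop, threshold=0.95):
    """From each highly intercorrelated pair, drop the descriptor that
    correlates less strongly with the property."""
    relevance = {name: abs(pearson_r(vals, prop)) for name, vals in descriptors.items()}
    kept = set(descriptors)
    for a, b in combinations(sorted(descriptors), 2):
        if a in kept and b in kept:
            if abs(pearson_r(descriptors[a], descriptors[b])) > threshold:
                kept.discard(a if relevance[a] < relevance[b] else b)
    return sorted(kept)

# Invented data: surface_area is nearly a linear function of n_carbons
prop = [10.0, 20.0, 30.0, 40.0, 50.0]
descriptors = {
    "n_carbons": [1, 2, 3, 4, 5],
    "surface_area": [20.1, 33.9, 48.2, 61.8, 76.0],
    "dipole": [0.3, 0.1, 0.4, 0.2, 0.5],
}
print(prune_redundant(descriptors, prop))
```

Here `surface_area` is discarded because it is almost perfectly correlated with `n_carbons` but tracks the property slightly less well.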

A curve fit is then done to create a linear equation, such as

$$\mathrm{Property} = c_0 + c_1 d_1 + c_2 d_2 \qquad (30.1)$$

where $c_i$ are the fitted parameters and $d_i$ the descriptors. Most often, the equation being fitted is a linear equation like the one above. This is because the use of correlation coefficients and linear equations together is an easily automated process. Introductory descriptions cite linear regression as the algorithm for determining coefficients of best fit, but the mathematically equivalent matrix least-squares method is actually more efficient and easier to implement. Occasionally, a nonlinear parameter, such as the square root or log of a quantity, is used. This is done when a researcher is aware of such nonlinear relationships in advance.
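A matrix least-squares fit of equation 30.1 can be sketched with the normal equations, solved here by a small Gaussian elimination so that the example needs no external libraries. This is one possible formulation, adequate for a small illustration; numerical libraries generally use more robust factorizations. The data are invented and were generated from known coefficients so the result can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def fit_linear_model(descriptor_rows, prop):
    """Least-squares coefficients [c0, c1, c2, ...] from (X^T X) c = X^T y."""
    design = [[1.0] + list(row) for row in descriptor_rows]   # leading 1 gives c0
    m = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * y for r, y in zip(design, prop)) for i in range(m)]
    return solve_linear_system(xtx, xty)

# Invented descriptor values (d1, d2) for six compounds
rows = [(1, 4), (2, 1), (3, 5), (4, 2), (5, 7), (6, 3)]
# Property generated from c0 = 2, c1 = 3, c2 = -0.5 so the fit can be verified
prop = [2.0 + 3.0 * d1 - 0.5 * d2 for d1, d2 in rows]

c0, c1, c2 = fit_linear_model(rows, prop)
print(f"Property = {c0:.3f} + {c1:.3f} d1 + {c2:.3f} d2")
```

A nonlinear parameter of the kind mentioned above would enter simply as another column, for example by storing the logarithm of a quantity as a descriptor before the fit.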


TABLE 30.1 Common Molecular Descriptors

Constitutional Descriptors
    Molecular weight
    Number of atoms of various elements
    Number of bonds of various orders
    Number of rings

Topological Descriptors
    Wiener index
    Randić indices
    Kier and Hall indices
    Information content
    Connectivity index
    Balaban index

Electrostatic Descriptors
    Partial charges
    Polarity indices
    Topological electronic index
    Multipoles
    Charged partial surface areas
    Polarizability
    Anisotropy of polarizability

Geometrical Descriptors
    Moments of inertia
    Molecular volume
    Molecular surface areas
    Shadow indices
    Taft steric constant
    Length, width, and height parameters
    Shape factor

Quantum Chemical Descriptors
    Net atomic charges
    Bond orders
    HOMO and LUMO energies
    FMO reactivity indices
    Refractivity
    Total energy
    Ionization potential
    Electron affinity
    Energy of protonation
    Orbital populations
    Frontier orbital densities
    Superdelocalizabilities
