Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Rogers Computational Chemistry Using the PC

.pdf
Скачиваний:
78
Добавлен:
15.08.2013
Размер:
4.65 Mб
Скачать

CURVE FITTING

 

 

73

SLOPE¼1.925152 , Y INTERCEPT¼2.100666

s

 

p22

¼

:

 

sb ¼

d11s2

0:194

 

m ¼ p

¼

 

 

 

 

d s2

 

0 078

TableCurve gives

 

 

 

TableCurve Output 3-5

 

 

 

 

Rank 3 Eqn 1 y ¼ a þ bx

 

 

 

 

r2 Coef Det

DF Adj r2

Fit Std Err

F-value

 

 

0.9869616170

0.9832363647

0.2842291542

605.57301820

 

 

Parm Value

Std Error

t-value

95% Confidence Limits

 

a

2.100666667

0.194165477

10.81895043

1.651357072

2.549976262

b

1.925151515

0.078231500

24.60839325

1.744119520

2.106183510

The result of our analysis is that the data are fit by the straight line

 

y ¼ 2:10 0:19 þ 1:93 0:08 x

 

 

ð3-36Þ

Note that there are two equations with a higher rank than y ¼ a þ bx. They are the exponential and power equations y ¼ a þ b expð x=cÞ and y ¼ a þ bxc. There is little to choose among the r2 values of these three curve fits, 0.9871, 0.9871, and 0.9870, so we choose the simplest equation. Small differences in r2 values should not be counted too heavily, and we should be wary of r2 values that look impressive. Note that Fig. 3-2 shows substantial scatter despite the high r2 value. Use common sense and an aesthetic of simplicity in choosing the best curve fit. As usual, the number of significant figures in the final equation (3-36) is determined by the number of significant figures going into the calculation (Fig. 3-2).

COMPUTER PROJECT 3-1 j Linear Curve Fitting: KF Solvation

Linear extrapolation of the experimental behavior of a real gas to zero pressure or a solute to infinite dilution is often used as a technique to ‘‘get rid’’ of molecular or ionic interactions so as to study some property of the molecule or ion to which these interferences are considered extraneous. Emsley (1971) studied the heat (enthalpy) of solutions of potassium fluoride KF and the monosolvated species KF HOAc in glacial acetic acid at several concentrations. A known weight of the anhydrous salt KF was added to a known weight of glacial acetic acid in a Dewar flask fitted with a heating coil, a stirrer, and a sensitive thermometer. The temperature change on each addition was recorded. The heat capacity C of the flask and its contents was determined by supplying a known amount of electrical energy Q to the flask and noting the temperature rise T in kelvins (K)

Qð joulesÞ ¼ C T

ð3-37Þ

The experiment was repeated for the solvated salt KF HOAc, where the molecule of solvation is acetic acid, HOAc. Some experimental results calculated from the original paper are shown in Table 3-2:

74

COMPUTATIONAL CHEMISTRY USING THE PC

Table 3-2 Enthalpies of Solution sol’n H298 of KF and KF HOAc in Glacial

Acetic Acid at 298 K

 

 

 

 

 

 

 

 

 

KF: C ¼ 4:168 kJ K 1

 

 

 

 

Molality

0.194

0.590

0.821

1.208

Temperature change, K

1.592

4.501

5.909

8.115

KF HOAc: C ¼ 4:203 kJ K 1

 

 

 

 

Molality

0.280

0.504

0.910

1.190

Temperature Change, K

0.227

0.432

0.866

1.189

Procedure. Calculate the heats of solution of the two species, KF and KF HOAc, at each of the four given molalities from a knowledge of the heat capacity. Calculate the enthalpy of solution per mole of solute sol’nH298 at each concentration. Find the least squares curve fit and calculate the uncertainties for the function

sol’n H298 ¼ sol’n H0298 sol’nH0298 þðSLOPEÞM SLOPE

ð3-38Þ

where M is the molality and sol’n H0298 is the enthalpy of solution at infinite dilution. Do this for each species, the anhydrous and the solvated fluoride. Record the slopes and intercepts of both functions, SLOPE1 for KF and SLOPE2 for

KF HOAc. Record the enthalpy changes and uncertainties, sol’n H0298 sol’nH0298 (KF) and sol’n H0298 sol’nH0298 (KF HOAc). What are the units of SLOPE?

Read the article on the original research (Emsley, 1971) and include a commentary on these results in your report for this experiment. Emsley claims that the enthalpy of solution

KFðsÞ þ HOAcðlÞ ! KF OAcHðsol’nÞ

is sol’n H0298 ¼ 38:5 kJ mol 1. What argument does he present to support this claim?

COMPUTER PROJECT 3-2 j The Boltzmann Constant

An interesting historical application of the Boltzmann equation involves examination of the number density of very small spherical globules of latex suspended in water. The particles are distributed in the potential gradient of the gravitational field. If an arbitrary point in the suspension is selected, the number of particles N at height h mm (1 mm ¼ 10 6 m) above the reference point can be counted with a magnifying lens. In one series of measurements, the number of particles per unit volume of the suspension as a function of h was as shown in Table 3-3.

The Boltzmann distribution gives

mgh

ð3-39Þ

N ¼ N0e kT

CURVE FITTING

 

 

 

 

 

75

Table 3-3 The Number of Latex Particles at Various Heights in mm Above a

 

 

Reference Point

 

 

 

 

 

 

 

 

 

 

 

 

 

h=mm

0

50

70

90

100

150

200

N

977

453

293

219

176

69

28

 

 

 

 

 

 

 

 

at constant temperature T, where m is the effective mass of the particle corrected for the buoyancy of the supporting medium.

Statistical analysis of the data set is best done by ‘‘linearizing’’ the function (Jurs, 1996), that is, by transforming it to a straight line of the form y ¼ a þ bx. In the case of the Boltzmann distribution, because y ¼ aebx leads to ln y ¼ ln a þ bx, we can take logarithms of both sides,

ln N ¼ ln N0

mgh

ð3-40Þ

kT

whereupon the slope of ln N vs. h is mghkT , where g is the acceleration due to the gravitational field and k is a universal constant now called Boltzmann’s constant and denoted kB.

The supporting medium was water at 298 K ðr ¼ 0:99727Þ, and the density of latex is 1.2049 g cm 3. The latex particles had an average radius of 2:12 10 4 mm; hence, their effective mass corrected for buoyancy is their volume 43p r3 times the density difference r between latex and the supporting medium, water

m ¼ nr ¼ ð4=3Þð2:12 10 7 3ð1:2049 0:99727Þ

This yields m ¼ 8:287 10 18 kg.

Procedure. Compute the slope of the function by a linear least squares procedure and obtain a value of Boltzmann’s constant. How many particles do you expect to find 125 mm above the reference point? Take the uncertainty you have calculated for the slope, as the uncertainty in kB. Is the modern value of kB ¼ 1:381 10 23 within these error limits?

Having determined kB and knowing that the gas constant R ¼ 8:314 J K 1 from macroscopic measurements on gases, determine Avogadro’s number L from the relationship

R ¼ kBL

ð3-41Þ

Calculate the % difference between L found by this method and the modern value of 6:022 1023. Does this support the idea that the Boltzmann constant is the gas constant per particle?

76

COMPUTATIONAL CHEMISTRY USING THE PC

COMPUTER PROJECT 3-3

j

The Ionization Energy of Hydrogen

The ionization energy for hydrogen is the minimum amount of energy that is required to bring about the reaction

H ! Hþ þ e

The ionization energy for hydrogen (or other hydrogen-like systems) can be found from the Rydberg equation

 

1

¼

 

1

1

 

 

 

n ¼ l

 

n12

n22

ð

 

Þ

 

 

 

R

 

 

 

3-42

 

along with an accurate set of spectral data. Eq. (3-42) leads us to believe that both the slope R and the intercept R of the linear function

 

¼

R

 

n22

 

ð

3-43

Þ

 

 

R 1

 

 

 

n

 

 

 

 

 

 

 

 

are equal to the ionization energy which has n22 ¼ 1.

Procedure. Use Mathcad, QLLSQ, or TableCurve (or, preferably, all three) to determine a value of the ionization energy of hydrogen from the wave numbers in Table 3-4 taken from spectroscopic studies of the Lyman series of the hydrogen spectrum where n1 ¼ 1.

Table 3-4 Spectral Wavenumbers v for the Lyman Series of Hydrogen

2

¼

2

3

4

5

6

7

n2

 

n2

¼ 4

9

16

25

36

49

v ¼ 82259

97492

102824

105292

106632

107440

Note that we are interested in n2, the atomic quantum number of the level to which the electron ‘‘jumps’’ in a spectroscopic excitation. Use the results of this data treatment to obtain a value of the Rydberg constant R. Compare the value you obtain with an accepted value. Quote the source of the accepted value you use for comparison in your report. What are the units of R? A conversion factor may be necessary to obtain unit consistency. Express your value for the ionization energy of H in units of hartrees (h), electron volts (eV), and kJ mol 1. We will need it later.

Reliability of Fitted Polynomial Parameters

The method of finding uncertainty limits for linear equations can be generalized to higher-order polynomials. The matrix method for finding the minimization

CURVE FITTING

77

parameters for the polynomial of next higher order (second order), y ¼ ax þ bx2 is

0

a

P xi

P

B Pxi2

P xi3

P

 

xi

xi2

P

@ P

P

xi3

10 b 1

¼

0 Pxiyi

1

ð3-44Þ

xi2

a

 

yi

C

 

xi4

C c

 

B Pxi2yi

 

 

A@ A

 

@ P

A

 

with the solution vector

Pxi2

P xi3

1

 

0 Pxiyi

1

ð3-45Þ

0 b 1

¼

0

xi

1

a

 

 

a

xi

xi2

 

yi

C

 

c

 

B Pxi2

P xi3

P xi4

C B Pxi2yi

 

@ A

 

@ P

P

P

A @ P

A

 

The quadratic curve fit leads to a number of residuals equal to the number of points in the data set. The sum of squares of residuals gives SSE by Eqs. (3-23) and MSE by Eq. (3-30), except that now the number of degrees of freedom for n points is n 3

SSE

ð3-46Þ

MSE ¼ n 3

p

MSE gives the standard deviation of the regression, s ¼ MSE.

The uncertainties of the minimization parameters are calculated just as they were for the linear case except that now there are three of them

sa ¼

p 22

 

3-47b

 

d11s2

ð3-47aÞ

 

b ¼ p

ð

 

Þ

s

 

d s2

 

 

 

and

c ¼ p

 

 

 

 

ð

 

Þ

s

 

d33s2

 

3-47c

 

In contrast to the linear case, there are three degrees of freedom, but there is still only one standard deviation of the regression, s. The reader has the opportunity to try out these ideas in Computer Project 3-4.

COMPUTER PROJECT 3-4 j The Partial Molal Volume of ZnCl2

In general, the volume of a solution, say ZnCl2 in water, is dependent on the number of moles ni of each of the components. For a binary solution,

V ¼ f ðn1; n2Þ

ð3-48Þ

The change in volume dV on adding a small amount dn1 of water or dn2 of ZnCl2 is

dV ¼

qn1 dn1

þ

qn2 dn2

ð3-49Þ

 

 

qV

 

 

qV

 

78

COMPUTATIONAL CHEMISTRY USING THE PC

where we stipulate that P and T are constant for the process and we adopt the usual subscript convention, 1 for solvent and 2 for solute. If we specify 1 kg as the amount of water, n2 is the molality of ZnCl2. We expect that the volume of the solution will be greater than 1000 cm3 by the volume taken up by the ZnCl2. It may seem reasonable to take the volume of one mole of ZnCl2 in the solid state Vm and add it to 1000 cm3 to get the volume of a 1 molal solution. One-half the molar volume of solute would, by this scheme, lead to the volume of a 0.5 molar solution and so on. This does not work. The volume of 1000 g of water in the solution is not exactly 1000 cm3, and it is dependent on the temperature. Nor are the volumes additive. Indeed, some solutes cause contraction of the solution to less than 1000 cm3.

Interactions at the molecular or ionic level cause an expansion or contraction of the solution so that, in general

V 6¼1000 þ Vm

We define a partial molar volume Vi such that

 

 

V ¼ n1V1

þ n2V2

for a binary solution or, in general,

 

N

 

X

V ¼

 

niVi

 

i¼1

for a solution of N components.

It can be shown (Alberty, 1987) that

¼ qV Vi

qni nj

ð3-50Þ

ð3-51Þ

ð3-52Þ

ð3-53Þ

where the subscript nj indicates that all components in the solution other than i are held constant. If the solution is a binary solution of n2 moles of solute in 1 kg of

water, V2 is the partial molal volume of component 2. A partial molal volume is a special case of the partial molar volume for 1 kg of solvent.

Procedure. A study on the partial molal volume of ZnCl2 solutions gave the following data (Alberty, 1987)

g ZnCl2 per kg of H2O

20:00

60:00

100:00

140:00

180:00

200:00

density

1:0167

1:0532

1:0891

1:1275

1:1665

1:1866

Calculate the number of moles of ZnCl2 per kilogram of water in each solution (the molality m). Calculate the volume V of solution containing 1 kg of water at each solute concentration. Plot V vs. m. Use program Mathcad, QQLSQ, or TableCurve

CURVE FITTING

79

(or all three, if available) to obtain a quadratic expression V ¼ a þ bm þ cm2. Obtain an expression for the slope dV/dm which is the same as ðqV=qn2Þn1 . This is the partial molal volume of ZnCl2. It is a partial volume because V varies with both nZnCl2 and the number of moles of water n1. What is the partial molal volume of ZnCl2 in water at 1.00 molal concentration? What is the partial molar volume of water at this concentration?

PROBLEMS

1.Select several values for m of the data set in Exercise 3-1 and calculate ðxi mÞ for each of them. Plot the curve of P ðxi 2 as a function of the selected parameter m and locate the minimum visually. Compare with Solution 3-1.

2.Obtain Eq. (3-8) from Eq. (3-6) through Eq. (3-7).

3.An excess of porphobilinogen in the urine is associated with hepatic disorders and lead poisoning. Porphobilinogen can be separated from other porphyrins by ion exchange chromatography and treated with p-dimethylaminobenzalde- hyde (PDMA) to produce a red compound that absorbs light strongly at 550 nm. A set of standard solutions was made up with concentrations of 50.0, 75.0, 100.0, 125.0, 150.0, 175.0, 200.0, 225.0, and 250.0 mg/100 mL of porphobilinogen. Their absorbances A after treatment with PDMA were 0.039, 0.061, 0.087, 0.107, 0.119, 0.163, 0.179, 0.194, and 0.213. What is the spectrophotometric calibration curve A ¼ f (concentration) for this method? What are the units of slope? Three urine specimens treated by this method yielded absorbances A of 0.180, 0.162, and 0.213. What were the porphobilinogen concentrations of these three samples?

4.Expand the three determinants D, Db, and Dm for the least squares fit to a linear function not passing through the origin so as to obtain explicit algebraic expressions for b and m, the y-intercept and the slope of the best straight line representing the experimental data.

5.Set up the determinants D, Da, Db, and Dc for the least squares fit to a parabolic curve not passing through the origin.

6.Expand the four 3 3 determinants obtained in Problem 5.

7.Using the expanded determinants from Problem 6, write explicit algebraic expressions for the three minimization parameters a, b, and c for a parabolic curve fit.

8.Compare the solution for Problems 6 and 7 with the BASIC statements in Program QQLSQ.BAS or QLLSQ.tru. They should agree.

9.Emsley, in the same paper referred to in Computer Project 3-1, presents viscosity measurements Z for solutions of KF in HOAc as a function of molality m with the following results

m

0:135

0:398

1:028

1:466

1:903

2:567

3:052

3:428

3:770

4:206

4:307

Z

1:38

1:77

3:69

5:97

8:68

16:34

29:29

41:53

58:57

92:73

106:36

80

COMPUTATIONAL CHEMISTRY USING THE PC

Carry out a statistical analysis of this data set including a fitting equation and all uncertainties.

10.Write the determinant for a sixth-degree curve fitting procedure.

11.The volume of ZnCl2 solutions containing 1000 g of water varies according to the quadratic equation (Computer Project 3-4)

V ¼ 999:71 þ 21:148 m þ 4:471 m2

Find the partial molal volume of ZnCl2 in these solutions at 0.5, 1.0, 1.5 and 2.0 molar concentrations.

Multivariate Least Squares Analysis

In multivariate least squares analysis, the dependent variable is a function of two or more independent variables. Because matrices are so conveniently handled by computer and because the mathematical formalism is simpler, multivariate analysis will be developed as a topic in matrix algebra rather than conventional algebra.

We have already seen the normal equations in matrix form. In the multivariate case, there are as many slope parameters as there are independent variables and there is one intercept. The simplest multivariate problem is that in which there are only two independent variables and the intercept is zero

y ¼ m1x1 þ m2x2

ð3-54Þ

To simplify the algebra, the error in the x variable will be considered negligible relative to the error in the y variable, although this is not a necessary condition. Equation (3-54) describes a plane passing through the origin. Let us restrict it to positive values of x1, x2, and y.

One measurement of the dependent variable yields y1 for known values of the independent variables x11; x12

y1

¼ m1x11

þ m2x12

ð3-55aÞ

and a second yields

 

y2

¼ m1x21

þ m2x22

ð3-55bÞ

for new values of the independent variables x21; x22.

The mathematical requirements for unique determination of the two slopes m1 and m2 are satisfied by these two measurements, provided that the second equation is not a linear combination of the first. In practice, however, because of experimental error, this is a minimum requirement and may be expected to yield the least reliable solution set for the system, just as establishing the slope of a straight line through the origin by one experimental point may be expected to yield the least reliable slope, inferior in this respect to the slope obtained from 2, 3, or p experimental points. In univariate problems, accepted practice dictates that we

CURVE FITTING

81

obtain many experimental points and determine the ‘‘best’’ slope representing them by a suitable univariate regression procedure.

The analogous procedure for a multivariate problem is to obtain many experimental equations like Eqs. (3-55) and to extract the best slopes from them by regression. Optimal solution for n unknowns requires that the slope vector be obtained from p equations, where p is larger than n, preferably much larger. When there are more than the minimum number of equations from which the slope vector is to be extracted, we say that the equation set is an overdetermined set. Clearly, n equations can be selected from among the p available equations, but this is precisely what we do not wish to do because we must subjectively discard some of the experimental data that may have been gained at considerable expense in time and money.

Equation set (3-55) can be written in matrix form

y ¼ Xm

ð3-56Þ

where X is the matrix of (known) input variables

X ¼

x11

x12

ð3-57Þ

x21

x22

y is the nonhomogeneous vector of dependent experimental measurements, and m is the slope vector, that is, the solution vector of the regression problem.

The deviations or residuals of y1 and y2 from the regression plane passing through the origin are

d1

¼ m1x11

þ m2x12

y1

 

 

 

 

ð3-58aÞ

d2

¼ m1x21

þ m2x22

y2

 

 

 

 

ð3-58bÞ

which are minimized by setting

q

 

P di2 and

q

 

P di2 equal to zero as shown in the

qm

1

qm

2

section on linear functions not passing through the origin. These minimizations yield

m1x112 þ m2x11x12 þ m1x212 þ m2x21x22 ¼ y1x11

þ y2x21

ð3-59aÞ

m1x11x12 þ m2x122 þ m1x21x22 þ m2x222 ¼ y1x12

þ y2x22

ð3-59bÞ

We are dealing with real numbers that commute; hence, it is evident that the right side of Equation set (3-59) is

x11

x21

y1

ð3-60Þ

x12

x22

y2

or

 

 

 

XTy

 

 

ð3-61Þ

where the matrix XT is the transpose of the input matrix X.

82

COMPUTATIONAL CHEMISTRY USING THE PC

The left side of the normal equations can be seen to be a product including X, its transpose, and m. Matrix multiplication shows that

XTXm

ð3-62Þ

is the matrix representation of the left side of the normal equations (see Problems). Two important facts emerge here. First, the method is general and can be worked out for the n n case ðn > 2Þ as above but with added labor. Second, the input matrix need not be square. By the geometric nature of a rectangular matrix, it is always conformable for multiplication into its own inverse. The result is a square product matrix of the smaller of the two dimensions of the rectangular input matrix. Indeed, for the present treatment to be nontrivial, the input matrix must be rectangular; a square input matrix with X 1 defined (X nonsingular) represents the problem of n independent equations in n unknowns, which is the problem we said we do not want to solve. From this point on, envision X as a p n matrix with

p > n.

The normal equations are simultaneous equations

XTXm ¼ XTy

ð3-63Þ

in which XTy yields a vector of the smaller dimension of XT, the same dimension as the vector obtained as the product XTXm. The normal equations are often written

XTX m ¼ q

ð3-64Þ

 

 

y.

where

XTX emerges as the coefficient matrix of a simple set of simultaneous

equations, m is the solution vector, and q is the nonhomogeneous vector XT Solution follows by the usual method of inverting the coefficient matrix and premultiplying it into both sides

XTX 1 XTX m ¼ XTX 1q

 

or

 

m ¼ XTX 1q

ð3-65Þ

This equation permits us to generate an n-fold slope vector m from a rectangular matrix X and a dependent variable vector y. The procedure is analogous to the univariate case of generating the slope of a calibration curve passing through the origin from p known values of xi and the corresponding measured values y1; . . . ; yi; . . . ; yp, with the intention of using m to determine unknown values of x (the concentration of an analyte perhaps) from future measurements of y.

Соседние файлы в предмете Химия