
382																					CHEMOMETRICS

Table 6.8 Derivative calculation for determining the purity of regions in the data in Table 6.1.

(a) Scaling the rows to constant total

i	A	B	C	D	E	F	G	H	I	J	K	L
1	−4.066	2.561	3.269	−2.295	0.292	1.240	−1.911	1.694	1.181	−2.380	−0.642	2.059
2	−0.176	0.000	0.183	−0.005	0.260	−0.136	0.045	0.498	−0.012	−0.005	0.128	0.220
3	0.145	0.111	0.404	−0.004	0.155	0.212	−0.246	−0.104	0.118	−0.099	0.114	0.193
4	0.157	0.117	0.136	0.080	0.014	0.143	0.037	−0.044	0.050	0.095	0.102	0.113
5	0.070	0.130	0.175	0.143	0.097	0.064	−0.001	0.056	0.042	0.046	0.090	0.088
6	0.101	0.126	0.150	0.143	0.066	0.060	0.039	0.027	0.069	0.087	0.086	0.045
7	0.084	0.139	0.164	0.126	0.075	0.047	0.046	0.041	0.066	0.068	0.074	0.069
8	0.093	0.135	0.146	0.123	0.097	0.061	0.032	0.030	0.047	0.074	0.084	0.078
9	0.087	0.123	0.152	0.126	0.093	0.057	0.046	0.045	0.056	0.068	0.075	0.071
10	0.081	0.120	0.124	0.127	0.097	0.064	0.057	0.043	0.062	0.075	0.079	0.069
11	0.060	0.102	0.115	0.113	0.095	0.086	0.081	0.071	0.070	0.071	0.076	0.061
12	0.046	0.075	0.081	0.104	0.108	0.112	0.110	0.101	0.077	0.062	0.065	0.058
13	0.034	0.042	0.064	0.080	0.105	0.134	0.145	0.134	0.104	0.061	0.048	0.048
14	0.010	0.025	0.039	0.068	0.114	0.154	0.160	0.143	0.111	0.078	0.051	0.046
15	0.008	0.020	0.024	0.065	0.110	0.163	0.184	0.158	0.109	0.063	0.045	0.051
16	0.008	−0.004	0.033	0.066	0.108	0.169	0.191	0.157	0.118	0.066	0.044	0.043
17	0.032	−0.003	0.028	0.061	0.097	0.163	0.205	0.200	0.114	0.054	0.013	0.036
18	0.004	−0.014	0.050	0.096	0.113	0.175	0.211	0.172	0.076	0.029	0.031	0.058
19	0.023	−0.091	−0.108	−0.038	−0.018	0.393	0.275	0.168	0.252	0.021	0.073	0.049
20	0.008	−0.027	−0.230	−0.358	0.439	0.297	0.135	0.750	0.096	−0.217	−0.089	0.195
21	1.664	−0.575	−0.659	0.719	1.151	0.094	−0.387	−1.732	0.394	0.245	0.115	−0.030
22	−0.057	0.235	0.597	−0.985	1.238	1.323	0.885	−0.708	0.070	−0.344	−0.048	−1.207
23	−0.422	−0.045	0.191	0.057	0.217	−0.553	0.287	0.373	0.079	0.264	0.631	−0.079
24	−0.221	0.037	−0.094	0.023	0.281	0.466	−0.009	0.090	−0.181	−0.116	0.305	0.420
25	−12.974	3.211	2.434	5.197	−3.829	3.105	−1.803	−3.461	2.053	0.395	3.118	3.553

EVOLUTIONARY SIGNALS																								383

(b) Absolute value of first derivative using a five-point quadratic Savitzky–Golay smoothing function

i	A	B	C	D	E	F	G	H	I	J	K	L
3	0.8605	0.4745	0.6236	0.4962	0.0636	0.2073	0.3814	0.3818	0.2216	0.4953	0.1439	0.4049
4	0.0480	0.0270	0.0296	0.0444	0.0448	0.0244	0.0235	0.0783	0.0087	0.0330	0.0108	0.0455
5	0.0178	0.0065	0.0466	0.0323	0.0108	0.0414	0.0586	0.0362	0.0085	0.0326	0.0095	0.0316
6	0.0114	0.0046	0.0009	0.0070	0.0144	0.0180	0.0036	0.0133	0.0017	0.0020	0.0052	0.0089
7	0.0026	0.0006	0.0049	0.0055	0.0023	0.0013	0.0085	0.0018	0.0007	0.0031	0.0032	0.0000
8	0.0038	0.0028	0.0064	0.0032	0.0081	0.0020	0.0036	0.0036	0.0024	0.0024	0.0013	0.0049
9	0.0059	0.0090	0.0120	0.0023	0.0041	0.0081	0.0096	0.0072	0.0025	0.0006	0.0002	0.0027
10	0.0119	0.0142	0.0168	0.0053	0.0025	0.0131	0.0193	0.0169	0.0075	0.0021	0.0039	0.0051
11	0.0140	0.0207	0.0221	0.0116	0.0035	0.0202	0.0253	0.0235	0.0111	0.0026	0.0070	0.0058
12	0.0167	0.0250	0.0222	0.0151	0.0043	0.0228	0.0271	0.0263	0.0131	0.0004	0.0085	0.0058
13	0.0141	0.0213	0.0224	0.0131	0.0035	0.0197	0.0256	0.0217	0.0111	0.0001	0.0075	0.0031
14	0.0103	0.0180	0.0134	0.0090	0.0004	0.0143	0.0200	0.0135	0.0087	0.0009	0.0044	0.0026
15	0.0007	0.0120	0.0077	0.0041	0.0023	0.0073	0.0150	0.0146	0.0027	0.0025	0.0076	0.0026
16	0.0011	0.0101	0.0025	0.0050	0.0015	0.0042	0.0121	0.0100	0.0064	0.0106	0.0072	0.0008
17	0.0027	0.0231	0.0247	0.0177	0.0251	0.0465	0.0202	0.0035	0.0245	0.0122	0.0043	0.0010
18	0.0008	0.0134	0.0661	0.0947	0.0548	0.0485	0.0042	0.1153	0.0095	0.0600	0.0206	0.0317
19	0.3269	0.1158	0.1653	0.0864	0.2434	0.0017	0.1260	0.3287	0.0581	0.0135	0.0085	0.0005
20	0.1518	0.0013	0.0543	0.1404	0.3419	0.1998	0.0687	0.3660	0.0130	0.0521	0.0115	0.2609
21	0.0956	0.0354	0.1423	0.0438	0.1269	0.0865	0.0773	0.1047	0.0374	0.0360	0.1157	0.1658
22	0.2543	0.0660	0.1121	0.0098	0.1250	0.0309	0.0386	0.0785	0.0871	0.0220	0.1303	0.0401
23	2.9440	0.7375	0.5495	0.9964	1.0917	0.5164	0.3725	0.2660	0.3065	0.0527	0.6359	0.8792


Table 6.8 (continued)

(c) Rejecting points 3, 22 and 23 and putting the measurements on a common scale

i	A	B	C	D	E	F	G	H	I	J	K	L
4	0.065	0.075	0.045	0.082	0.050	0.042	0.043	0.066	0.038	0.124	0.046	0.079
5	0.024	0.018	0.071	0.060	0.012	0.071	0.107	0.031	0.037	0.122	0.040	0.055
6	0.015	0.013	0.001	0.013	0.016	0.031	0.007	0.011	0.008	0.007	0.022	0.015
7	0.003	0.002	0.007	0.010	0.003	0.002	0.016	0.002	0.003	0.012	0.013	0.000
8	0.005	0.008	0.010	0.006	0.009	0.003	0.007	0.003	0.010	0.009	0.005	0.008
9	0.008	0.025	0.018	0.004	0.005	0.014	0.018	0.006	0.011	0.002	0.001	0.005
10	0.016	0.039	0.025	0.010	0.003	0.023	0.035	0.014	0.033	0.008	0.016	0.009
11	0.019	0.057	0.033	0.021	0.004	0.035	0.046	0.020	0.049	0.010	0.029	0.010
12	0.023	0.069	0.034	0.028	0.005	0.039	0.049	0.022	0.058	0.001	0.036	0.010
13	0.019	0.059	0.034	0.024	0.004	0.034	0.047	0.018	0.049	0.001	0.032	0.005
14	0.014	0.050	0.020	0.017	0.000	0.025	0.037	0.011	0.038	0.003	0.019	0.005
15	0.001	0.033	0.012	0.008	0.003	0.013	0.027	0.012	0.012	0.009	0.032	0.005
16	0.001	0.028	0.004	0.009	0.002	0.007	0.022	0.008	0.028	0.040	0.030	0.001
17	0.004	0.064	0.037	0.033	0.028	0.080	0.037	0.003	0.108	0.046	0.018	0.002
18	0.001	0.037	0.100	0.175	0.061	0.084	0.008	0.097	0.042	0.225	0.087	0.055
19	0.444	0.321	0.250	0.160	0.272	0.003	0.230	0.277	0.256	0.051	0.036	0.001
20	0.206	0.004	0.082	0.260	0.382	0.345	0.125	0.309	0.057	0.195	0.049	0.450
21	0.130	0.098	0.216	0.081	0.142	0.149	0.141	0.088	0.164	0.135	0.489	0.286


(d) Calculating the final consensus derivative

i	d
4	0.063
5	0.054
6	0.013
7	0.006
8	0.007
9	0.010
10	0.019
11	0.028
12	0.031
13	0.027
14	0.020
15	0.014
16	0.015
17	0.038
18	0.081
19	0.192
20	0.205
21	0.177
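The four stages of Table 6.8 can be sketched in a few lines of Python. This is a minimal sketch, not the book's own code: the function name `derivative_purity`, its `reject` argument and the use of NumPy are assumptions made here for illustration. The five-point quadratic Savitzky–Golay first-derivative coefficients are taken as (−2, −1, 0, 1, 2)/10, which is consistent with the tabulated values (for example 0.8605 for variable A at point 3).

```python
import numpy as np

def derivative_purity(X, reject=()):
    """Consensus derivative for a (datapoints x variables) matrix X.

    Returns the 1-based datapoint numbers kept and the value d for each.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # (a) scale each row to a constant total of 1
    scaled = X / X.sum(axis=1, keepdims=True)
    # (b) absolute value of the five-point quadratic first derivative
    deriv = np.abs(
        -2 * scaled[0:n - 4] - scaled[1:n - 3] + scaled[3:n - 1] + 2 * scaled[4:n]
    ) / 10.0
    points = np.arange(3, n - 1)  # two points are lost at each end
    # (c) reject selected points, then scale each variable to sum to 1
    keep = ~np.isin(points, list(reject))
    points, deriv = points[keep], deriv[keep]
    common = deriv / deriv.sum(axis=0, keepdims=True)
    # (d) consensus: average over the variables
    return points, common.mean(axis=1)

# Toy check: two variables whose scaled intensities change linearly with time
a = 0.1 + 0.02 * np.arange(9)
toy = np.column_stack([a, 1.0 - a])
points, d = derivative_purity(toy)
print(points)  # datapoints 3 to 7
print(d)       # constant, because the composition changes at a constant rate
```

For the data of Table 6.1 the call would be `derivative_purity(X, reject=(3, 22, 23))`, and low values of `d` would point to regions of constant composition.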


Figure 6.29 Derivative purity plot for the data in Table 6.1 with the purest points indicated (horizontal axis: datapoint, 0 to 25; vertical axis ticks at 1, 0.1, 0.01 and 0.001)

regions of differing composition, but the visual display is often very informative and can cope well with unusual peakshapes.

6.4 Resolution

Resolution or deconvolution of two-way chromatograms or mixture spectra involves converting a cluster of peaks into its constituent parts, each ideally representing a component of the signal from a single compound. The number of named methods in the literature is enormous, and it would be completely outside the scope of this text to discuss each approach in detail. In areas such as chemical pattern recognition or calibration, certain generic approaches are accepted as part of an overall strategy and the data preprocessing, variable selection, etc., are regarded as extra steps. In the field of resolution of evolutionary data, there is a fondness for packaging a series of steps into a named method, so there are probably 20 or more named methods, and maybe as many unnamed approaches reported in the literature. However, most are based on a number of generic principles, which are described in this chapter.

There are several aims for resolution.

1. Obtaining the profiles for each resolved compound. These might be the elution profiles (in chromatography) or the concentration distribution in a series of compounds (in spectroscopy of mixtures) or the pH profiles of different chemical species.


2. Obtaining the spectra of each pure compound. This allows identification or library searching. In some cases, this procedure merely uses the multivariate signals to improve on the quality of the individual spectra, which may be noisy, but in other cases, such as an embedded peak, genuinely difficult information can be gleaned. This is particularly useful in impurity monitoring.

3. Obtaining quantitative information. This involves using the resolved two-way data to provide concentrations (or relative concentrations when pure standards are not available).

4. Automation. Complex chromatograms may consist of 50 or more peaks, some of which will be noisy and overlapping. Speeding up procedures, for example, using rapid chromatography in a matter of minutes resulting in considerable overlap, rather than taking 30 min per chromatogram, also results in embedded peaks. Chemometrics can ideally pull out the constituents’ spectra and profiles.

The methods in this chapter differ from those in Chapter 5 in that pure standards are not required for the model.

Whereas some datasets can be very complicated, it is normal to divide the data into small regions where there are signals from only a few components. Even in the spectroscopy of mixtures, in many cases such as MIR or NMR it is normally easy to find regions of the spectra where only two or three compounds at the most absorb, so this process of finding windows rather than analysing an entire dataset in one go is normal. Hence we will limit the discussion to three peak clusters in this section. Naturally the methods in Section 6.3 would usually first be applied to the entire dataset to identify these regions. We will illustrate the discussion below primarily in the context of coupled chromatography.

6.4.1 Selectivity for All Components

These methods involve first finding some pure or selective (composition 1) region in the chromatogram or selective spectral measurement such as an m/z value for each compound in a mixture.

6.4.1.1 Pure Spectra and Selective Variables

The most straightforward situation is when each compound has a composition 1 region. The simplest approach is to estimate the pure spectrum in such a region. There are several methods.

1. Take the spectrum at the point of maximum purity for each compound.

2. Average the spectra for each compound over each composition 1 region.

3. Perform PCA over each composition 1 region separately (so if there are three compounds, perform three PCA calculations) and then take the loadings of the first PC as an estimate of the pure spectrum. PCA is used as a smoothing technique, the idea being that the noise is banished to later PCs.
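The three estimates listed above can be compared side by side. The following is an illustrative sketch under stated assumptions: the function name `pure_spectrum_estimates` and its arguments are hypothetical, PCA is carried out as an uncentred singular value decomposition, and the toy data contain a single noise-free compound so that all three estimates have the same shape.

```python
import numpy as np

def pure_spectrum_estimates(X, region, purest):
    """Three estimates of a pure spectrum from a composition 1 region.

    X      : datapoints x variables matrix
    region : slice selecting the rows of the composition 1 region
    purest : row index of the point of maximum purity
    """
    block = X[region]
    at_max_purity = X[purest]          # method 1: a single spectrum
    averaged = block.mean(axis=0)      # method 2: average over the region
    # method 3: loadings of the first PC (first right singular vector)
    _, _, vt = np.linalg.svd(block, full_matrices=False)
    loading = vt[0]
    if loading.sum() < 0:              # the sign of a PC is arbitrary
        loading = -loading
    return at_max_purity, averaged, loading

# Toy data: one compound, no noise (hypothetical numbers)
profile = np.array([0.2, 0.6, 1.0, 0.7, 0.3])
spectrum = np.array([0.5, 0.9, 0.4, 0.1])
X = np.outer(profile, spectrum)
single, averaged, loading = pure_spectrum_estimates(X, slice(0, 5), purest=2)
```

The three vectors differ only in scale here. With noisy data they would differ in shape as well, the averaged and PCA estimates usually being smoother than a single spectrum.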

Some rather elaborate multivariate methods are also available that, instead of using the spectra in the composition 1 regions, use the elution profiles. In the case of Table 6.1 we might guess that the fastest eluting compound A has a composition 1 region between points 4 and 8, and the slowest eluting B between points 15 and 19. Hence we could divide up the chromatogram as follows.

1. points 1–3: no compounds elute;

2. points 4–8: compound A elutes selectively;

3. points 9–14: co-elution;

4. points 15–19: compound B elutes selectively;

5. points 20–25: no compounds elute.

As discussed above, there can be slight variations on this theme. This is represented in Figure 6.30. Chemometrics is used to fill in the remaining pieces of the jigsaw. The only unknowns are the elution profiles in the composition 2 regions. The profiles in the composition 1 regions can be estimated either by using the summed profiles or by performing PCA in these regions and taking the scores of the first PC.

An alternative is to find pure variables rather than composition 1 regions. These methods are popular when using various types of spectroscopy such as in LC–MS or in the MIR of mixtures. Wavelengths, frequencies or masses belonging to single compounds can often be identified. In the case of Table 6.3, we suspect that variables C and F are diagnostic of the two compounds (see Figure 6.16), and their profiles are presented in Figure 6.31. Note that these profiles are somewhat noisy. This is fairly common in techniques such as mass spectrometry. It is possible to improve the quality of the profiles by using methods for smoothing as described in Chapter 3, or to average profiles from several pure variables. The latter technique is useful in NMR or IR spectroscopy where a peak might be defined by several datapoints, or where there could be a number of selective regions in the spectrum.

The result of this section will be to produce either a first guess of all or part of the concentration profiles, represented by the matrix $\hat{\mathbf{C}}$, or of the spectra, $\hat{\mathbf{S}}$.

6.4.1.2 Multiple Linear Regression

If pure profiles can be obtained from all components, the next step in deconvolution is straightforward.

Figure 6.30 Composition of regions in chromatogram deriving from Table 6.1 (rows: Compound A, Compound B, Composition)

Figure 6.31 Profiles of variables C and F in Table 6.3 (vertical axis: intensity, −2 to 8; horizontal axis: datapoint, 1 to 21)

In the case of Table 6.1, we can guess the pure spectra for A as the average of the data between times 4 and 8, and for B as the average between times 15 and 19. These make up a 2 × 12 data matrix $\hat{\mathbf{S}}$. Since

$$\mathbf{X} \approx \hat{\mathbf{C}}.\hat{\mathbf{S}}$$

therefore

$$\hat{\mathbf{C}} = \mathbf{X}.\hat{\mathbf{S}}'.(\hat{\mathbf{S}}.\hat{\mathbf{S}}')^{-1}$$

as discussed in Chapter 5 (Section 5.3). The estimated spectra are listed in Table 6.9 and the resultant profiles are presented in Figure 6.32. Note that the vertical scale in fact has no direct physical meaning: intensity data can only be reconstructed by multiplying the profiles by the spectra. However, MLR has provided a very satisfactory estimate, and provided that pure regions are available for each significant component, is probably entirely adequate as a tool in many cases.
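The MLR step amounts to one line of matrix algebra. The sketch below uses the two estimated spectra of Table 6.9 as the spectral matrix. The helper name `mlr_profiles` and the Gaussian elution profiles `C_true` are hypothetical, and serve only to build a noise-free data matrix on which the calculation can be checked.

```python
import numpy as np

# Estimated spectra of A and B from Table 6.9 (2 compounds x 12 variables)
S_hat = np.array([
    [0.519, 0.746, 0.862, 0.713, 0.454, 0.341,
     0.194, 0.176, 0.312, 0.410, 0.465, 0.404],
    [0.041, 0.006, 0.087, 0.221, 0.356, 0.603,
     0.676, 0.575, 0.395, 0.199, 0.136, 0.162],
])

def mlr_profiles(X, S):
    """Profiles from known spectra: C = X.S'.(S.S')^-1."""
    return X @ S.T @ np.linalg.inv(S @ S.T)

# Hypothetical elution profiles over 25 datapoints (illustration only)
t = np.arange(1, 26)
C_true = np.column_stack([
    np.exp(-0.5 * ((t - 9) / 2.5) ** 2),
    np.exp(-0.5 * ((t - 14) / 2.5) ** 2),
])
X = C_true @ S_hat                 # noise-free data matrix, 25 x 12
C_hat = mlr_profiles(X, S_hat)     # recovers C_true in this ideal case
```

With real data the recovered profiles carry noise and, as noted above, their vertical scale depends on how the spectra were scaled.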

Table 6.9 Estimated spectra obtained from the composition 1 regions in the example of Table 6.1.

A	B	C	D	E	F	G	H	I	J	K	L
0.519	0.746	0.862	0.713	0.454	0.341	0.194	0.176	0.312	0.410	0.465	0.404
0.041	0.006	0.087	0.221	0.356	0.603	0.676	0.575	0.395	0.199	0.136	0.162

Figure 6.32 Reconstructed profiles for the data in Table 6.1 using MLR (vertical axis: intensity, −0.5 to 2.5; horizontal axis: datapoint)

If pure variables such as spectral frequencies or m/z values can be determined, even if there are embedded peaks, it is also possible to use these to obtain first estimates of elution profiles, $\hat{\mathbf{C}}$; then the spectra can be obtained using all (or a great proportion of) the variables by

$$\hat{\mathbf{S}} = (\hat{\mathbf{C}}'.\hat{\mathbf{C}})^{-1}.\hat{\mathbf{C}}'.\mathbf{X}$$

The concentration profile can be improved by increasing the number of variables; so, for example, the first guess might involve using one variable per compound, the next 20 significant variables and the final 100 or more. This approach is also useful in spectroscopy of mixtures, if pure frequencies can be identified for each compound. Using these for initial estimates of the concentrations of each compound in each spectrum, the full spectra can be reconstructed even when there are overlapping regions. Such approaches are useful in MIR, but not so valuable in NIR or UV/vis spectroscopy where it is often hard to find selective wavelengths and the effectiveness depends on the type of spectroscopy employed.
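As a rough illustration of the pure-variable route, the sketch below takes one selective variable per compound as the first guess of the profiles and then regresses the whole data matrix on them. All numbers are hypothetical: `S_true` is a made-up four-variable spectral matrix in which variable 0 responds only to the first compound and variable 3 only to the second, and `mlr_spectra` is a name chosen here.

```python
import numpy as np

def mlr_spectra(X, C):
    """Spectra from estimated profiles: S = (C'.C)^-1.C'.X."""
    return np.linalg.inv(C.T @ C) @ C.T @ X

# Hypothetical two-compound mixture (illustration only)
t = np.arange(1, 26)
C_true = np.column_stack([
    np.exp(-0.5 * ((t - 9) / 2.5) ** 2),
    np.exp(-0.5 * ((t - 14) / 2.5) ** 2),
])
S_true = np.array([
    [1.0, 0.6, 0.3, 0.0],   # compound 1: no signal at variable 3
    [0.0, 0.2, 0.7, 0.9],   # compound 2: no signal at variable 0
])
X = C_true @ S_true

C_hat = X[:, [0, 3]]             # pure variables as first profile estimates
S_hat = mlr_spectra(X, C_hat)    # full spectra, including overlapping variables
```

The estimated spectra agree with `S_true` up to a scale factor per compound, which reflects the point made earlier that the absolute scale of profiles and spectra has no direct physical meaning.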

6.4.1.3 Principal Components Regression

PCR is an alternative to MLR (Section 5.4) and can be used in signal analysis just as in calibration. There are a number of ways of employing PCA, but a simple approach is to note that the scores and loadings can be related to the concentration profile and spectra by

$$\mathbf{X} \approx \hat{\mathbf{C}}.\hat{\mathbf{S}} = \mathbf{T}.\mathbf{R}.\mathbf{R}^{-1}.\mathbf{P}$$

hence

$$\hat{\mathbf{C}} = \mathbf{T}.\mathbf{R}$$

and

$$\hat{\mathbf{S}} = \mathbf{R}^{-1}.\mathbf{P}$$

If we perform PCA on the dataset, and know the pure spectra, it is possible to find the matrix $\mathbf{R}^{-1}$ simply by regression since

$$\mathbf{R}^{-1} = \hat{\mathbf{S}}.\mathbf{P}'$$

[because the loadings are orthonormal (Chapter 4, Section 4.3.2) this equation is simple]. It is then easy to obtain $\hat{\mathbf{C}}$. This procedure is illustrated in Table 6.10 using the spectra as obtained from Table 6.9. The profiles are very similar to those presented in Figure 6.32 and so are not presented graphically for brevity.
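The relationships above translate directly into code. The following is a sketch under stated assumptions rather than the calculation behind Table 6.10: PCA is performed as an uncentred singular value decomposition, two components are retained, the spectra are those of Table 6.9, and the elution profiles `C_true` are hypothetical Gaussians used to build a noise-free data matrix.

```python
import numpy as np

# Estimated spectra of A and B from Table 6.9 (2 compounds x 12 variables)
S_hat = np.array([
    [0.519, 0.746, 0.862, 0.713, 0.454, 0.341,
     0.194, 0.176, 0.312, 0.410, 0.465, 0.404],
    [0.041, 0.006, 0.087, 0.221, 0.356, 0.603,
     0.676, 0.575, 0.395, 0.199, 0.136, 0.162],
])

def pcr_profiles(X, S, k=2):
    """Profiles by PCR when the pure spectra S are known."""
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    T = u[:, :k] * s[:k]        # scores
    P = vt[:k]                  # loadings, orthonormal rows
    R_inv = S @ P.T             # R^-1 = S.P'
    R = np.linalg.inv(R_inv)
    return T @ R, R_inv @ P     # C = T.R and S reconstructed as R^-1.P

# Hypothetical elution profiles over 25 datapoints (illustration only)
t = np.arange(1, 26)
C_true = np.column_stack([
    np.exp(-0.5 * ((t - 9) / 2.5) ** 2),
    np.exp(-0.5 * ((t - 14) / 2.5) ** 2),
])
X = C_true @ S_hat
C_hat, S_back = pcr_profiles(X, S_hat)
```

In this noise-free case `C_hat` equals `C_true` and `S_back` equals `S_hat`. With real data the two-component model discards part of the noise, so the results differ slightly from those of MLR.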

PCR can be employed in more elaborate ways using the known profiles in the composition 1 (and sometimes composition 0) region for each compound. These methods were the basis of some of the earliest approaches to resolution of two-way chromatographic data. There are several variants, and one is as follows.

1. Choose only those regions where one component elutes. In our example in Table 6.1, we will use the regions between times 4–8 and 15–19 inclusive, which involves 10 points.

2. For each compound, use either the estimated profiles if the region is composition 1 or 0 if another compound elutes in this region. A matrix is obtained of size $Z \times 2$ whose columns correspond to each component, where $Z$ equals the total number of composition 1 datapoints. In our example, the matrix is of size 10 × 2, half of the values being 0 and half consisting of the profile in the composition 1 region. Call this matrix $\mathbf{Z}$.

3. Perform PCA on the overall matrix.

4. Find a matrix $\mathbf{R}$ such that $\mathbf{Z} \approx \mathbf{T}.\mathbf{R}$ using the known profiles obtained in step 2, simply by using regression so that $\mathbf{R} = (\mathbf{T}'.\mathbf{T})^{-1}.\mathbf{T}'.\mathbf{Z}$ but including the scores only of the composition 1 region.

5. Knowing $\mathbf{R}$, it is a simple matter to reconstruct the concentration profiles by including the scores over the entire data matrix as above, and similarly the spectra.

The key steps in the calculation are presented in Table 6.11. Note that the magnitudes of the numbers in the matrix $\mathbf{R}$ differ from those presented in Table 6.10. This is simply because the magnitudes of the estimates of the spectra and profiles are different, and have no physical significance. The resultant profiles obtained by the multiplication $\hat{\mathbf{C}} = \mathbf{T}.\mathbf{R}$ on the entire dataset are illustrated in Figure 6.33.
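The five steps of this variant might be coded as follows. This is a sketch and not the calculation behind Table 6.11: the function name `pcr_selective` is hypothetical, PCA is an uncentred singular value decomposition, the profile in each composition 1 region is estimated as the summed intensity over all variables, and the test data are noise-free, built from the spectra of Table 6.9 and made-up profiles that are exactly zero outside their elution windows.

```python
import numpy as np

def pcr_selective(X, regions):
    """PCR resolution from one composition 1 region per compound.

    regions : list of slices, one per compound, selecting its selective rows
    """
    k = len(regions)
    # step 3: PCA on the overall matrix
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    T, P = u[:, :k] * s[:k], vt[:k]
    # step 1: rows of all composition 1 regions
    rows = np.concatenate([np.arange(X.shape[0])[r] for r in regions])
    # step 2: summed profile where a compound is selective, 0 elsewhere
    Z = np.zeros((rows.size, k))
    start = 0
    for j, r in enumerate(regions):
        summed = X[r].sum(axis=1)
        Z[start:start + summed.size, j] = summed
        start += summed.size
    # step 4: regress Z on the scores of the composition 1 rows only
    T1 = T[rows]
    R = np.linalg.inv(T1.T @ T1) @ T1.T @ Z
    # step 5: profiles over the entire dataset, and the spectra
    return T @ R, np.linalg.inv(R) @ P

# Spectra of Table 6.9 and hypothetical truncated profiles (illustration only)
S = np.array([
    [0.519, 0.746, 0.862, 0.713, 0.454, 0.341,
     0.194, 0.176, 0.312, 0.410, 0.465, 0.404],
    [0.041, 0.006, 0.087, 0.221, 0.356, 0.603,
     0.676, 0.575, 0.395, 0.199, 0.136, 0.162],
])
t = np.arange(25)  # 0-based row index; datapoint number is t + 1
cA = np.where((t >= 3) & (t <= 13), np.exp(-0.5 * ((t - 8) / 2.0) ** 2), 0.0)
cB = np.where((t >= 8) & (t <= 18), np.exp(-0.5 * ((t - 14) / 2.0) ** 2), 0.0)
C_true = np.column_stack([cA, cB])
X = C_true @ S

# points 4-8 for A and 15-19 for B, as 0-based slices
C_hat, S_hat = pcr_selective(X, [slice(3, 8), slice(14, 19)])
```

Because the summed intensity is used as the profile estimate, `C_hat` comes out as the true profiles multiplied by the total of each spectrum, and `S_hat` is scaled down correspondingly, so that their product still reproduces `X`.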

In straightforward cases, PCR is unnecessary and if not carefully controlled may provide worse results than MLR. However, for more complex systems it can be very useful.
