EVOLUTIONARY SIGNALS
(continued from p. 401)

0.159  0.281  0.431  0.192  0.488  0.335  0.196  0.404  0.356  0.265
0.076  0.341  0.629  0.294  0.507  0.442  0.252  0.592  0.352  0.196
0.138  0.581  0.883  0.351  0.771  0.714  0.366  0.805  0.548  0.220
0.223  0.794  1.198  0.543  0.968  0.993  0.494  1.239  0.766  0.216
0.367  0.865  1.439  0.562  1.118  1.130  0.578  1.488  0.837  0.220
0.310  0.995  1.505  0.572  1.188  1.222  0.558  1.550  0.958  0.276
0.355  0.895  1.413  0.509  1.113  1.108  0.664  1.423  0.914  0.308
0.284  0.723  1.255  0.501  0.957  0.951  0.520  1.194  0.778  0.219
0.350  0.593  0.948  0.478  0.738  0.793  0.459  0.904  0.648  0.177
0.383  0.409  0.674  0.454  0.555  0.629  0.469  0.684  0.573  0.126
0.488  0.220  0.620  0.509  0.494  0.554  0.580  0.528  0.574  0.165
0.695  0.200  0.492  0.551  0.346  0.454  0.695  0.426  0.584  0.177
0.877  0.220  0.569  0.565  0.477  0.582  0.747  0.346  0.685  0.168
0.785  0.230  0.486  0.724  0.346  0.601  0.810  0.370  0.748  0.147
0.773  0.204  0.435  0.544  0.321  0.442  0.764  0.239  0.587  0.152
0.604  0.141  0.417  0.504  0.373  0.458  0.540  0.183  0.504  0.073
0.493  0.083  0.302  0.359  0.151  0.246  0.449  0.218  0.392  0.110
0.291  0.050  0.096  0.257  0.034  0.199  0.238  0.142  0.271  0.018
0.204  0.034  0.126  0.097  0.092  0.095  0.215  0.050  0.145  0.034
The aim of this problem is to explore different approaches to signal resolution using a variety of common chemometric methods.
1. Plot a graph of the sum of intensities at each point in time. Verify that it looks as if there are three peaks in the data.
2. Calculate the derivative of the spectrum, scaled at each point in time to a constant sum, at each wavelength as follows.
a. Rescale the spectrum at each point in time by dividing by the total intensity at that point in time, so that the total intensity at each point in time equals 1.
b. Then calculate the smoothed five-point quadratic Savitzky–Golay first derivatives, as presented in Chapter 3, Table 3.6, independently for each of the 10 wavelengths. A table consisting of derivatives at 26 times and 10 wavelengths should be obtained.
c. Superimpose the 10 graphs of derivatives at each wavelength.
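Questions 2 and 3 can be sketched with NumPy and SciPy. The matrix below is a simulated stand-in (the real values come from the table above); `scipy.signal.savgol_filter` with a five-point window and quadratic polynomial reproduces the Savitzky–Golay first-derivative filter of Table 3.6.

```python
import numpy as np
from scipy.signal import savgol_filter

# Simulated stand-in for the 26 x 10 table (times x wavelengths).
rng = np.random.default_rng(0)
X = np.abs(rng.normal(1.0, 0.3, size=(26, 10)))

# a. Scale each point in time to a constant (unit) total intensity.
X_scaled = X / X.sum(axis=1, keepdims=True)

# b. Five-point quadratic Savitzky-Golay first derivative, computed
#    independently down each wavelength (column).
D = savgol_filter(X_scaled, window_length=5, polyorder=2, deriv=1, axis=0)
print(D.shape)   # (26, 10)

# Question 3: mean absolute derivative over the 10 wavelengths at each time.
purity = np.abs(D).mean(axis=1)
```

With the real data, minima of `purity` point at the purest time points.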
3. Summarise the change in derivative with time by calculating the mean of the absolute value of the derivative over all 10 wavelengths at each point in time. Plot a graph of this, and explain why a value close to zero indicates a good pure, or composition-1, point in time. Show that this suggests that points 6, 17 and 26 are good estimates of pure spectra for each component.
4. The concentration profiles of each component can be estimated using MLR as follows.
a. Obtain estimates of the spectra of each pure component at the three points of highest purity, to give an estimated spectral matrix Ŝ.
b. Using MLR, calculate Ĉ = X.Ŝ′.(Ŝ.Ŝ′)⁻¹.
c. Plot a graph of the predicted concentration profiles.
5. An alternative method is PCR. Perform uncentred PCA on the raw data matrix X and verify that there are approximately three components.
6. Using estimates of each pure component given in question 4(a), perform PCR as follows.
a. Using regression, find the matrix R for which Ŝ = R.P, where P is the loadings matrix obtained in question 5; keep three PCs only.
b. Estimate the elution profiles of all three peaks, since Ĉ ≈ T.R⁻¹.
c. Plot these graphically.
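The PCR route (questions 5 and 6) can be sketched on simulated data; the pure-point row indices used here are arbitrary stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.abs(rng.normal(size=(26, 3))) @ np.abs(rng.normal(size=(3, 10)))

# Question 5: uncentred PCA via SVD; keep three PCs.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
T, P = U[:, :3] * s[:3], Vt[:3, :]   # scores (26 x 3), loadings (3 x 10)

# Stand-in pure-spectrum estimates (rows chosen arbitrarily here).
S_hat = X[[0, 12, 25], :]

R = S_hat @ P.T                 # a. S_hat = R.P; P has orthonormal rows
C_hat = T @ np.linalg.inv(R)    # b. C ~= T.R^-1
print(np.allclose(C_hat @ S_hat, X))   # True for an exactly rank-3 X
```

Because P has orthonormal rows, regressing Ŝ onto P reduces to Ŝ.P′; with noisy data the final `allclose` becomes approximate rather than exact.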
Problem 6.5 Titration of Three Spectroscopically Active Compounds with pH
Section 6.2.2 Section 6.3.3 Section 6.4.1.2
The data in the table on page 405 represent the spectra of a mixture of three spectroscopically active species recorded at 25 wavelengths over 36 different values of pH.
1. Perform PCA on the raw uncentred data, and obtain the scores and loadings for the first three PCs.
2. Plot a graph of the loadings of the first PC and superimpose this on the graph of the average spectrum over all the observed pHs, scaling the two graphs so that they are of approximately similar size. Comment on why the first PC is not very useful for discriminating between the compounds.
3. Calculate the logarithm of the correlation coefficient between each successive spectrum, and plot this against pH (there will be 35 numbers; plot the logarithm of the correlation between the spectra at pH 2.15 and 2.24 against the lower pH). Show how this is consistent with there being three different spectroscopic species in the mixture. On the basis of three components, are there pure regions for each component, and over which pH ranges are these?
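The successive-correlation calculation in question 3 can be sketched on a simulated stand-in for the 36 × 25 table; the smooth drift makes neighbouring spectra highly correlated, as real titration spectra are.

```python
import numpy as np

# Simulated stand-in: 36 pH values x 25 wavelengths, drifting smoothly.
rng = np.random.default_rng(6)
base = np.abs(rng.normal(1.0, 0.3, size=25))
drift = np.abs(rng.normal(1.0, 0.3, size=25))
X = np.array([base + (i / 35.0) * drift + 0.01 * rng.normal(size=25)
              for i in range(36)])

# Correlation between each successive pair of spectra: 35 numbers,
# plotted against the lower pH of each pair.
r = np.array([np.corrcoef(X[i], X[i + 1])[0, 1] for i in range(35)])
log_r = np.log10(r)
print(log_r.shape)   # (35,)
```

Dips in `log_r` mark pH regions where the spectrum changes fastest, i.e. where species interconvert.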
4. Centre the data and produce three scores plots: PC2 vs PC1, PC3 vs PC1 and PC3 vs PC2. Label each point with its pH (Excel users will have to adapt the macro provided). Comment on these plots, especially in the light of the correlation graph in question 3.
5. Normalise the scores of the first two PCs obtained in question 4 by dividing by the square root of the sum of squares at each pH. Plot the graph of the normalised scores of PC2 vs PC1, labelling each point as in question 4, and comment.
6. Using the information above, choose one pH which best represents the spectra for each of the three compounds (there may be several answers to this, but they should not differ by a great deal). Plot the spectra of each pure compound, superimposed on one another.
7. Using the guesses of the spectra for each compound in question 6, perform MLR to obtain estimated profiles for each species by Ĉ = X.Ŝ′.(Ŝ.Ŝ′)⁻¹. Plot a graph of the pH profiles of each species.
Problem 6.6 Resolution of Mid-infrared Spectra of a Three-component Mixture
Section 6.2.2 Section 6.2.3.1 Section 6.4.1.2
The table on page 406 represents seven spectra consisting of different mixtures of three compounds, 1,2,3-trimethylbenzene, 1,3,5-trimethylbenzene and toluene, whose mid-infrared spectra have been recorded at 16 cm⁻¹ intervals between 528 and 2000 cm⁻¹; you will need to reorganise the data as a matrix of dimensions 7 × 93.
1. Scale the data so that the sum of the spectral intensities at each wavelength equals 1 (note that this differs from the usual method, which is along the rows, and is a way of putting equal weight on each wavelength). Perform PCA, without further preprocessing, and produce a plot of the loadings of PC2 vs PC1.
2. Wavelengths of low intensity are not very useful. Identify those wavelengths for which the sum over all seven spectra is greater than 10 % of the largest such sum, and label these in the graph in question 1.
3. Comment on the appearance of the graph in question 2, and suggest three wavelengths that are typical of each of the compounds.
4. Using the three wavelengths selected in question 3, obtain a 7 × 3 matrix of relative concentrations in each of the spectra and call this Ĉ.
5. Calling the original data X, obtain the estimated spectra for each compound by Ŝ = (Ĉ′.Ĉ)⁻¹.Ĉ′.X and plot these graphically.
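The left pseudo-inverse in question 5 can be sketched on simulated data; when X = C.S holds exactly, the estimate recovers S.

```python
import numpy as np

# Simulated: 7 spectra, 3 components, 93 wavelengths, X = C.S exactly.
rng = np.random.default_rng(3)
C_hat = np.abs(rng.normal(size=(7, 3)))
S_true = np.abs(rng.normal(size=(3, 93)))
X = C_hat @ S_true

# S_hat = (C'.C)^-1 . C' . X
S_hat = np.linalg.inv(C_hat.T @ C_hat) @ C_hat.T @ X
print(np.allclose(S_hat, S_true))   # True
```

With the real mixture spectra the fit is only approximate, and each row of `S_hat` is one estimated pure spectrum to plot.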
Chemometrics: Data Analysis for the Laboratory and Chemical Plant.
Richard G. Brereton
Copyright © 2003 John Wiley & Sons, Ltd.
ISBNs: 0-471-48977-8 (HB); 0-471-48978-6 (PB)
Appendices
A.1 Vectors and Matrices
A.1.1 Notation and Definitions
A single number is often called a scalar, and is represented by italics, e.g. x.
A vector consists of a row or column of numbers and is represented by bold lower case italics, e.g. x. For example, x = (3 −11 9 0) is a row vector and

    ( 5.6 )
y = ( 2.8 )
    ( 1.9 )

is a column vector.
A matrix is a two-dimensional array of numbers and is represented by bold upper case italics, e.g. X. For example,

X = (  3  12  8 )
    ( −2  14  1 )

is a matrix.
The dimensions of a matrix are normally presented with the number of rows first and the number of columns second, and vectors can be considered as matrices with one dimension equal to 1, so that x above has dimensions 1 × 4 and X has dimensions 2 × 3.
A square matrix is one where the number of columns equals the number of rows. For example,

Y = ( −7   4   −1 )
    ( 11   3    6 )
    (  2  −4  −12 )

is a square matrix.
An identity matrix is a square matrix whose elements are equal to 1 in the diagonal and 0 elsewhere, and is often denoted by I. For example,

I = ( 1  0 )
    ( 0  1 )

is an identity matrix.
The individual elements of a matrix are often referenced as scalars, with subscripts referring to the row and column; hence, in the matrix above, y21 = 11, which is the element in row 2 and column 1. Optionally, a comma can be placed between the subscripts for clarity; this is useful if one of the dimensions exceeds 9.
A.1.2 Matrix and Vector Operations
A.1.2.1 Addition and Subtraction
Addition and subtraction are the most straightforward operations. The matrices (or vectors) must have the same dimensions, and the operation is performed element by element. Hence
( 8  4 )   ( 11   3 )   ( 19   7 )
( 9  7 ) + (  0  −7 ) = (  9   0 )
( 2  4 )   ( −5   6 )   ( −3  10 )
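The same element-by-element rule in code, with small illustrative matrices:

```python
import numpy as np

# Matrices must share dimensions; addition is element by element.
A = np.array([[8, 4], [9, 7]])
B = np.array([[11, 3], [0, -7]])
print(A + B)   # [[19  7]
               #  [ 9  0]]
```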
A.1.2.2 Transpose
Transposing a matrix involves swapping the columns and rows around, and may be denoted by a right-hand-side superscript (′). For example, if
Z = ( 3.1  0.2  6.1  4.8 )
    ( 9.2  3.8  2.0  5.1 )

then

Z′ = ( 3.1  9.2 )
     ( 0.2  3.8 )
     ( 6.1  2.0 )
     ( 4.8  5.1 )
Some authors use a superscript T instead.
A.1.2.3 Multiplication
Matrix and vector multiplication using the ‘dot’ product is denoted by the symbol ‘.’ between matrices. It is only possible to multiply two matrices together if the number of columns of the first matrix equals the number of rows of the second matrix. The number of rows of the product will equal the number of rows of the first matrix, and the number of columns equal the number of columns of the second matrix. Hence a 3 × 2 matrix when multiplied by a 2 × 4 matrix will give a 3 × 4 matrix.
Multiplication of matrices is not commutative; that is, generally A.B ≠ B.A even if the second product is allowable. Matrix multiplication can be expressed in the form of summations. For arrays with more than two dimensions (e.g. tensors), conventional symbolism can be awkward and it is probably easier to think in terms of summations.
If matrix A has dimensions I × J and matrix B has dimensions J × K, then the product C of dimensions I × K has elements defined by
       J
cik = Σ aij bjk
      j=1
Hence

( 9  3 )                      ( 54  93  123  42 )
( 1  7 ) · ( 6  10  11  3 ) = (  6  17   67  38 )
( 2  5 )   ( 0   1   8  5 )   ( 12  25   62  31 )
To illustrate this, the element in the second row and second column of the product is given by 17 = 1 × 10 + 7 × 1.
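The summation formula and the worked product above can be checked with an explicit triple loop:

```python
import numpy as np

A = np.array([[9, 3], [1, 7], [2, 5]])            # 3 x 2
B = np.array([[6, 10, 11, 3], [0, 1, 8, 5]])      # 2 x 4

# c_ik = sum over j of a_ij * b_jk
C = np.zeros((3, 4))
for i in range(3):
    for k in range(4):
        C[i, k] = sum(A[i, j] * B[j, k] for j in range(2))

print(C[1, 1])                    # 17.0 = 1*10 + 7*1
print(np.array_equal(C, A @ B))   # True
```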
When several matrices are multiplied together, it is normal to take any two neighbouring matrices, multiply them together and then multiply this product with another neighbouring matrix. It does not matter in what order this is done: A.B.C = (A.B).C = A.(B.C), so matrix multiplication is associative. Matrix multiplication is also distributive, that is, A.(B + C) = A.B + A.C.
A.1.2.4 Inverse
Most square matrices have inverses, defined as the matrix which, when multiplied with the original matrix, gives the identity matrix; the inverse is represented by a −1 right-hand-side superscript, so that D.D⁻¹ = I. Note that some square matrices do not have inverses: this is caused by correlations in the original matrix; such matrices are called singular matrices.
A.1.2.5 Pseudo-inverse
In several sections of this text we use the idea of a pseudo-inverse. If matrices are not square, it is not possible to calculate an inverse, but the concept of a pseudo-inverse
exists and is employed in regression analysis.
If A = B.C then B′.A = B′.B.C, so (B′.B)⁻¹.B′.A = C, and (B′.B)⁻¹.B′ is said to be the left pseudo-inverse of B.
Equivalently, A.C′ = B.C.C′, so A.C′.(C.C′)⁻¹ = B, and C′.(C.C′)⁻¹ is said to be the right pseudo-inverse of C.
In regression, the equation A ≈ B.C is an approximation; for example, A may represent a series of spectra that are approximately equal to the product of two matrices such as scores and loadings matrices, hence this approach is important to obtain the best fit model for C knowing A and B or for B knowing A and C.
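A sketch of the left pseudo-inverse used as a regression estimator, on simulated data (the names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(20, 3))
C_true = rng.normal(size=(3, 8))
A = B @ C_true + 0.01 * rng.normal(size=(20, 8))   # A ~= B.C

# Least-squares estimate of C via the left pseudo-inverse of B.
C_fit = np.linalg.inv(B.T @ B) @ B.T @ A
# numpy.linalg.pinv computes the same pseudo-inverse more stably.
print(np.allclose(C_fit, np.linalg.pinv(B) @ A))   # True
```

In practice `np.linalg.lstsq` or `pinv` is preferred over forming (B′.B)⁻¹ explicitly, for numerical stability.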
A.1.2.6 Trace and Determinant
Other properties of square matrices sometimes encountered are the trace, which is the sum of the diagonal elements, and the determinant, which relates to the size of the matrix. A determinant of 0 indicates a matrix without an inverse. A very small determinant often suggests that the data are highly correlated, or arise from a poor experimental design, resulting in unreliable predictions. If the dimensions of a matrix are large and the magnitudes of the measurements small (e.g. 10⁻³), it is possible to obtain a determinant close to zero even though the matrix has an inverse; a solution to this problem is to multiply each measurement by a number such as 10³ and then remember to readjust the magnitude of the numbers in the resultant calculations to take account of this later.
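Both properties, and the rescaling trick, can be illustrated numerically on a 3 × 3 example matrix:

```python
import numpy as np

# Trace and determinant of a 3 x 3 example matrix.
Y = np.array([[-7.0, 4.0, -1.0],
              [11.0, 3.0, 6.0],
              [2.0, -4.0, -12.0]])
print(np.trace(Y))          # -16.0, the sum of the diagonal
print(np.linalg.det(Y))     # nonzero, so Y has an inverse

# Small measurement magnitudes shrink the determinant sharply:
# det(k.Y) = k^n det(Y) for an n x n matrix, here n = 3.
print(np.linalg.det(1e-3 * Y) / 1e-9)   # recovers det(Y)
```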
A.1.2.7 Vector Length
An interesting property that chemometricians sometimes use is that the product of the transpose of a column vector with itself equals the sum of squares of the elements of the vector, so that x′.x = Σxi². The length of a vector is given by √(x′.x) = √(Σxi²), or
the square root of the sum of squares of its elements. This can be visualised in geometry as the length of the line from the origin to the point in space indicated by the vector.
A.2 Algorithms
There are many different descriptions of the various algorithms in the literature. This Appendix describes one algorithm for each of four regression methods.
A.2.1 Principal Components Analysis
NIPALS is a common, iterative algorithm often used for PCA. Some authors use another method called SVD (singular value decomposition). The main difference is that NIPALS extracts components one at a time, and can be stopped after the desired number of PCs has been obtained. In the case of large datasets with, for example, 200 variables (e.g. in spectroscopy), this can be very useful and reduce the amount of effort required. The steps are as follows.
Initialisation
1.Take a matrix Z and, if required, preprocess (e.g. mean centre or standardise) to give the matrix X which is used for PCA.
New Principal Component
2. Take a column of this matrix (often the column with the greatest sum of squares) as the first guess of the scores of the first principal component; call it initial t̂.
Iteration for each Principal Component
3. Calculate

unnorm p̂ = initial t̂′.X / Σ(initial t̂)²
4. Normalise the guess of the loadings, so that

p̂ = unnorm p̂ / √(Σ(unnorm p̂)²)
5. Now calculate a new guess of the scores:
new tˆ = X.pˆ
Check for Convergence
6. Check whether this new guess differs from the previous one; a simple approach is to look at the size of the sum of square differences between the old and new scores, i.e. Σ(initial t̂ − new t̂)². If this is small, the PC has been extracted: set the scores (t) and loadings (p) for the current PC to t̂ and p̂. Otherwise, return to step 3, substituting the initial scores by the new scores.
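The steps above can be collected into a minimal NIPALS sketch. The deflation step (subtracting t.p′ from X before extracting the next PC) is standard NIPALS but lies beyond this excerpt; the simulated matrix and the comparison with SVD are illustrative.

```python
import numpy as np

def nipals(X, n_components, tol=1e-12, max_iter=500):
    """Extract principal components one at a time from a preprocessed X."""
    X = X.astype(float).copy()
    scores, loadings = [], []
    for _ in range(n_components):
        # 2. Initial guess: the column with the greatest sum of squares.
        t = X[:, np.argmax((X ** 2).sum(axis=0))].copy()
        for _ in range(max_iter):
            p = (t @ X) / (t @ t)        # 3. unnormalised loadings
            p = p / np.sqrt(p @ p)       # 4. normalise
            t_new = X @ p                # 5. new guess of the scores
            converged = ((t - t_new) ** 2).sum() < tol   # 6. convergence
            t = t_new
            if converged:
                break
        scores.append(t)
        loadings.append(p)
        X = X - np.outer(t, p)           # deflate before the next PC
    return np.array(scores).T, np.array(loadings)

rng = np.random.default_rng(5)
X = rng.normal(size=(15, 6))
Xc = X - X.mean(axis=0)                  # 1. mean centre
T, P = nipals(Xc, 3)
print(T.shape, P.shape)                  # (15, 3) (3, 6)
```

The column norms of T should match the first three singular values of the centred matrix, which is a quick way to check the implementation against SVD.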