
APPENDICES	473

Figure A.46 Scatterplot in Matlab

identical number of elements. There are various ways of telling Matlab that a variable is a string (or character) rather than a numeric variable. Any data surrounded by single quotes is treated as a string, so the array c = ['a'; 'b'; 'c'] will be treated by Matlab as a 3 × 1 character array. Figure A.47 illustrates the use of this method. Leaving one or two spaces before the actual text helps to prevent the labels from overlapping the points in the graph. If there is still some overlap, the labels can be moved later in the graph editor.
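A minimal sketch of the labelling approach described above. The coordinates x and y and the label letters are made up for illustration; they are not the data behind Figure A.47.

```matlab
% Hypothetical data: three points to be labelled a, b and c
x = [1; 2; 3];
y = [2; 4; 3];

% Single quotes make a character array; every row must contain the same
% number of characters. Two leading spaces offset the text from the point.
c = ['  a'; '  b'; '  c'];

plot(x, y, 'o')      % draw the points as circles
text(x, y, c)        % one row of c is written at each (x, y) position
```

Because each row of c must have the same length, shorter labels need padding with spaces.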

Sometimes the labels are originally in a numerical format, for example they may consist of points in time or wavelengths. For Matlab to use these as labels, the numbers must first be converted to strings using the num2str function. An example is given in Figure A.48, where the first column of the matrix consists of the numbers 10, 15 and 20, which may represent times, the aim being to plot the second column against the third and use the first for labelling. Of course, any array can contain the labels.
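The conversion can be sketched as follows. The matrix X is hypothetical: only the first column (10, 15 and 20) comes from the text, and the values in the other two columns are invented.

```matlab
% Hypothetical 3 x 3 matrix: column 1 holds the labels (e.g. times),
% columns 2 and 3 hold the values to be plotted (made-up numbers)
X = [10 0.5 1.2;
     15 0.9 1.9;
     20 1.4 2.3];

labels = num2str(X(:,1));                % character array, one row per label
labels = [repmat('  ', 3, 1) labels];    % pad with two spaces to clear the points

plot(X(:,2), X(:,3), 'o')                % second column against third
text(X(:,2), X(:,3), labels)             % first column used for labelling
```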

A.5.7.5 Three-dimensional Graphics

Matlab can be very useful for the representation of data in three dimensions, in contrast to Excel where there are no straightforward 3D functions. In Chapter 6 we used 3D scores and loadings plots.

Consider a scores matrix of dimensions 36 × 3 (T) and a loadings matrix of dimensions 3 × 25 (P). The command plot3(T(:,1),T(:,2),T(:,3)) produces a graph of all three columns against one another; see Figure A.49. Often the default orientation is not the most informative for our purposes, and we may wish to change this. There are a huge number of commands in Matlab to do this, which is a big bonus for the enthusiast, but for the first time user the easiest is to select the right-hand rotation icon and interactively change the view; see Figure A.50. Once the desired view is reached, release the icon.

Figure A.47 Use of text command in Matlab
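For readers who prefer the command line to the rotation icon, the following sketch shows one way to draw the plot and set its orientation. The scores matrix is a random stand-in, and the azimuth and elevation angles are arbitrary choices for illustration.

```matlab
% Random stand-in for a 36 x 3 scores matrix (not real data)
T = randn(36, 3);

figure(1)
plot3(T(:,1), T(:,2), T(:,3))    % columns 1 to 3 plotted against one another
xlabel('PC 1'), ylabel('PC 2'), zlabel('PC 3')
grid on

% Set the orientation from the command line instead of dragging the plot
view(30, 20)                     % azimuth 30 degrees, elevation 20 degrees
```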

Figure A.48 Using numerical to character conversion for labelling of graphs

Often we want to return to a particular view, and a way of keeping the same perspective is via the view command. Typing A = view will keep this information in a 4 × 4 matrix A. Enthusiasts will be able to interpret its elements in fundamental terms, but it is not necessary to understand this when first using 3D graphics in Matlab. However, in chemometrics we often wish to look simultaneously at 3D scores and loadings plots and it is important that both have identical orientations. The way to do this is to ensure that the loadings have the same orientation as the scores. The commands

figure(2)

plot3(P(1,:),P(2,:),P(3,:))   % P is 3 × 25, so each row holds the loadings of one component

view(A)

should place a loadings plot with the same orientation in Matlab's Figure 2 window. This does not always work the first time; the reasons are rather complicated and depend on the overall starting orientation, but it is usually easy to see when it has succeeded. If you are in a mess, start again from scratch. Scores and loadings plots with the same orientation are presented in Figure A.51.

Figure A.49 A 3D scores plot

Figure A.50 Using the rotation icon

Figure A.51 Scores and loadings plots with identical orientations
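The idea of copying an orientation from one plot to another can be sketched as below. This sketch stores the orientation as an azimuth and elevation pair rather than the 4 × 4 matrix described above; that substitution is an assumption made for illustration, and both matrices are random stand-ins.

```matlab
% Random stand-ins for the scores (36 x 3) and loadings (3 x 25) matrices
T = randn(36, 3);
P = randn(3, 25);

figure(1)
plot3(T(:,1), T(:,2), T(:,3))    % scores: one column per component
view(40, 25)                     % pretend this view was chosen interactively
[az, el] = view;                 % store the current orientation

figure(2)
plot3(P(1,:), P(2,:), P(3,:))    % loadings: one row per component
view(az, el)                     % apply the stored orientation
```

Storing two angles keeps the saved state easy to read and to retype by hand.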

The experienced user can improve these graphs just as with the 2D graphs, for example by labelling axes or individual points, or by using symbols in addition to, or instead of, joining the points with a line. The scatter3 statement has similar properties to plot3.
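A short sketch of these refinements, assuming a random stand-in scores matrix and using the row number of each sample as its label (both are illustrative choices, not taken from the book):

```matlab
T = randn(36, 3);                % random stand-in scores, not real data
labels = num2str((1:36)');       % label each point with its row number

figure(1)
scatter3(T(:,1), T(:,2), T(:,3), 'filled')   % symbols only, no joining line
xlabel('PC 1'), ylabel('PC 2'), zlabel('PC 3')
text(T(:,1), T(:,2), T(:,3), labels)         % three-coordinate form of text
```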

Chemometrics: Data Analysis for the Laboratory and Chemical Plant.

Richard G. Brereton

ISBNs: 0-471-48977-8 (HB); 0-471-48978-6 (PB)

Index

Note: Figures and tables are indicated by italic page numbers

agglomerative clustering	227
Alchemist (e-zine) 11
algorithms
partial least squares	413–17
principal components analysis	412–13
analogue-to-digital converter (ADC), and digital resolution	128
analysis of variance (ANOVA)	24–30
with F-test	42
analytical chemists, interests		2–3, 5
analytical error	21

application scientists, interest in chemometrics	3, 4–5
auto-correlograms	142–5
automation, resolution needed due to	387
autoprediction error	200, 313–15
autoregressive moving average (ARMA) noise	129–31
autoregressive component	130
moving average component	130
autoscaling	356
average linkage clustering	228
backward expanding factor analysis	376
base peaks, scaling to	354–5
baseline correction	341, 342
Bayesian classification functions	242
Bayesian statistics	4, 169
biplots	219–20
C programming language, use of programs in Excel	446
calibration	271–338
case study	273, 274–5
history	271
and model validation	313–23
multivariate	271
problems on	323–38
terminology	273, 275
univariate	276–84
usage	271–3
calibration designs	69–76
problem(s) on	113–14
uses	76
canonical variates analysis	233
Cauchy distribution, and Lorentzian peakshape	123

central composite designs					76–84
axial (or star) points in					77, 80–3
degrees of freedom for					79–80
and modelling		83
orthogonality		80–1, 83
problem(s) on		106–7, 115–16
rotatability	80, 81–3
setting up of		76–8
and statistical factors					84
centring, data scaling by					212–13
chemical engineers, interests						2, 6
chemical factors, in PCA					191–2
chemists, interests 2
chemometricians, characteristics							5
chemometrics
people interested in				1, 4–6
reading recommendations						8–9
relationship to other disciplines							3
Chemometrics and Intelligent Laboratory Systems (journal)	9
Chemometrics World (Internet resource)	11
chromatography
digitisation of data	126
principal components analysis applications
column performance	186, 189, 190
resolution of overlapping peaks	186, 187, 188
signal processing for					120, 122
class distance plots			235–6, 239, 241
class distances		237, 239
in SIMCA	245
class modelling		243–8
problem(s) on		265–6
classical calibration			276–9
compared with inverse calibration	279–80, 280, 281
classification
chemist’s need for	230
see also supervised pattern recognition
closure, in row scaling	215
cluster analysis	183, 224–30
compared with supervised pattern recognition	230
graphical representation of results	229–30
linkage methods			227–8
next steps	229

problem(s) on		256–7
similarity measures			224–7
coding of data, in significance testing	37–9
coefficients of model	19
determining	33–4, 55
column scaling, data preprocessing by								356–60
column vector	409
composition
determining	365–86
by correlation based methods							372–5
by derivatives		380–6
by eigenvalue based methods							376–80
by similarity based methods						372–6
by univariate methods 367–71
meaning of term 365–7
compositional mixture experiments							84
constrained mixture designs					90–6
lower bounds specified	90–1, 91
problem(s) on	110–11
upper bounds specified	91–3, 91
upper and lower bounds specified	91, 93
with additional factor added as filler	91, 93
constraints
experimental design affected by							90–6
and resolution		396, 398
convolution 119, 138, 141, 162–3
convolution theorem			161–3
Cooley–Tukey algorithm				147
correlated noise 129–31
correlation coefficient(s)	419
in cluster analysis			225
composition determined by					372–5
problem(s) on		398, 404
in design matrix		56
Excel function for calculating						434
correlograms	119, 142–7
auto-correlograms 142–5
cross-correlograms 145–6
multivariate correlograms					146–7
problem(s) on		175–6, 177–8
coupled chromatography
amount of data generated					339
matrix representation of data						188, 189
principal components based plots							342–50
scaling of data		350–60
variable selection for				360–5
covariance, meaning of term					418–19
Cox models	87
cross-citation analysis			1
cross-correlograms 145–6
problem(s) on		175–6
cross-validation
limitations	317
in partial least squares				316–17
problem(s) on		333–4
in principal components analysis	199–204
Excel implementation	452
problem(s) on	267, 269
in principal components regression	315–16
purposes	316–17
in supervised pattern recognition	232, 248
cumulative standardised normal distribution	420, 421
data compression, by wavelet transforms								168
data preprocessing/scaling						210–18
by column scaling					356–60
by mean centring				212–13, 283, 307, 309, 356
by row scaling			215–17, 350–5
by standardisation					213–15, 309, 356
in Excel		453
in Matlab		464–5
datasets	342
degrees of freedom
basic principles			19–23
in central composite design							79–80
dendrograms		184, 229–30
derivatives		138
composition determined by							380–6
problem(s) on				398, 401, 403–4
of Gaussian curve					139
for overlapping peaks						138, 140
problem(s) on			179–80
Savitzky–Golay method for calculating	138, 141
descriptive statistics	417–19
correlation coefficient	419
covariance		418–19
mean	417–18
standard deviation					418
variance		418
design matrices and modelling							30–6
coding of data			37–9
determining the model						33–5
for factorial designs					55
matrices		31–3
models	30–1
predictions 35–6
problem(s) on			102
determinant (of square matrix)							411
digital signal processing (DSP), reading recommendations	11
digitisation of data				125–8
effect on digital resolution							126–8
problem(s) on			178–9
discrete Fourier transform (DFT) 147
and sampling rates					154–5
discriminant analysis					233–42
extension of method					242
and Mahalanobis distance							236–41
multivariate models					234–6
univariate classification	233–4
discriminant partial least squares (DPLS) method	248–9
distance measures	225–7
problem(s) on	257, 261–3
see also Euclidean...; Mahalanobis...; Manhattan distance measure
dot product	410
double exponential (Fourier) filters	158, 160–1
dummy factors	46, 68
eigenvalue based methods, composition determined by	376–80
eigenvalues		196–9
eigenvectors		193
electronic absorption spectroscopy (EAS)
calibration for			272, 284
case study			273, 274–5
experimental design					19–23
see also UV/vis spectroscopy
embedded peaks			366, 367, 371
determining profiles of	395
entropy
definition	171
see also maximum entropy techniques
environmental processes, time series data									119
error, meaning of term					20
error analysis		23–30
problem(s) on			108–9
Euclidean distance measure							225–6, 237
problem(s) on			257, 261–3
evolutionary signals				339–407
problem(s) on			398–407
evolving factor analysis (EFA) 376–8
problem(s) on			400
Excel	7, 425–56
add-ins 7, 436–7
for linear regression						436, 437
for multiple linear regression								7, 455–6
for multivariate analysis							7, 449, 451–6
for partial least squares							7, 454–5
for principal components analysis 7, 451–2
for principal components regression	7, 453–4
systems requirements	7, 449
arithmetic functions of ranges and matrices	433–4
arithmetic functions of scalars								433
AVERAGE function					428
cell addresses
alphanumeric format						425
invariant		425
numeric format				426–7
chart facility			447, 448, 449, 450
labelling of datapoints							447
compared with Matlab						8, 446
copying cells or ranges						428, 429–30
CORREL function					434

equations and functions			430–6
FDIST function	42, 435
file referencing	427
graphs produced by 447, 448, 449, 450
logical functions	435
macros
creating and editing		440–5
downloadable	7, 447–56
running 437–40
matrix operations	431–3
MINVERSE function			432, 432
MMULT function		431, 432
TRANSPOSE function			431, 432
names and addresses		425–30
naming matrices or vectors	430, 431
nesting and combining functions and equations	435–6
NORMDIST function				435
NORMINV function				45, 435
ranges of cells 427–8
scalar operations			430–1
statistical functions			435
STDEV/STDEVP functions						434
TDIST function		42, 435
VAR/VARP functions				434
Visual Basic for Applications (VBA)	7, 437, 445–7
worksheets
maximum size			426
naming 427
experimental design			15–117
basic principles		19–53
analysis of variance				23–30
degrees of freedom				19–23
design matrices and modelling 30–6
leverage and confidence in models	47–53
significance testing	36–47
central composite/response surface designs
76–84
factorial designs		53–76
fractional factorial designs						60–6
full factorial designs				54–60
partial factorials at several levels							69–76
Plackett–Burman designs						67–9
Taguchi designs			69
introduction	15–19
mixture designs		84–96
constrained mixture designs 90–6
simplex centroid designs						85–8
simplex lattice designs					88–90
with process variables					96
problems on	102–17
calibration designs				113–14
central composite designs						106–7, 115–16
design matrix		102
factorial designs			102–3, 105–6, 113–14
mixture designs	103–4, 110–11, 113, 114–15, 116–17
principal components analysis	111–13
significance testing	104–5
simplex optimisation	107–8
reading recommendations			10
simplex optimisation		97–102
elaborations	99
fixed sized simplex	97–9
limitations	101–2
modified simplex	100–1
terminology	275
experimental error	21–2
estimating	22–3, 77
exploratory data analysis (EDA)	183
baseline correction	341, 342
compared with unsupervised pattern recognition	184
data preprocessing/scaling for	350–60
principal component based plots	342–50
variable selection	360–5
see also factor analysis; principal components analysis
exponential (Fourier) filters	156, 157
double	158, 160–1
F distribution				421–4
one-tailed			422–3
F-ratio		30, 42, 43
F-test	42–3, 421
with ANOVA				42
face centred cube design							77
factor, meaning of term							19
factor analysis (FA)					183, 204–5
compared with PCA							185, 204
see also evolving factor analysis; PARAFAC models; window factor analysis
factorial designs				53–76
four-level			60
fractional			60–6
examples of construction								64–6
matrix of effects						63–4
problem(s) on					102–3
full	54–60
problem(s) on					105–6
Plackett–Burman designs								67–9
problem(s) on					109–10
problems on				102–3, 105–6, 109–10
Taguchi designs					69
three-level			60
two-level 54–9
design matrices for							55, 62
disadvantages					59, 60
and normal probability plots									43
problem(s) on					102, 102–3, 105–6
reduction of number of experiments 61–3
uses		76
two-level fractional						61–6
disadvantages	66
half factorial designs	62–5
quarter factorial designs	65–6
fast Fourier transform (FFT)	156
filler, in constrained mixture design	93
Fisher, R. A.	36, 237
Fisher discriminant analysis	233
fixed sized simplex, optimisation using	97–9
fixed sized window factor analysis	376, 378–80
flow injection analysis (FIA), problem(s) on	328
forgery, detection of	184, 211, 237, 251
forward expanding factor analysis	376
Fourier deconvolution	121, 156–61
Fourier filters	156–61
exponential filters	156, 157
influence of noise	157–61
Fourier pair	149
Fourier self-deconvolution						121, 161
Fourier transform algorithms							156
Fourier transform techniques							147–63
convolution theorem					161–3
Fourier filters	156–61
Fourier transforms				147–56
problem(s) on		174–5, 180–1
Fourier transforms			120–1, 147–56
forward	150–1
general principles				147–50
inverse	151, 161
methods	150–2
numerical example				151–2
reading recommendations							11
real and imaginary pairs							152–4
absorption lineshape						152, 153
dispersion lineshape						152, 153
and sampling rates				154–6
fractional factorial designs							60–6
in central composite designs								77
problem(s) on		102–3
freedom, degrees of see degrees of freedom
frequency domains, in NMR spectroscopy											148
full factorial designs				54–60
in central composite designs								77
problem(s) on		105–6
furthest neighbour clustering							228
gain vector	164
Galois field theory	2
Gaussians	123
compared with Lorentzians								124
derivatives of		139
in frequency and time domains									149
generators (in factorial designs)								67
geological processes, time series data										119
graphical representation
cluster analysis results						229–30
Excel facility		447, 448, 450
Matlab facility		469–78
principal components					205–10
