
Brereton Chemometrics


CHEMOMETRICS

 

 

Figure 6.2 Dividing data into regions prior to baseline correction (regions labelled Baseline and Peaks)

6.2.2 Principal Component Based Plots

Scores and loadings plots have been introduced in Chapter 4 (Section 4.3.5). In this chapter we will explore some further properties, especially useful where one or both of the variables are related in sequence. Table 6.1 represents a two-way dataset, corresponding to HPLC–DAD, each elution time being represented by a row and each measurement (such as successive wavelengths) by a column, giving a 25 × 12 data matrix, which will be called dataset A. The data represent two partially overlapping chromatographic peaks. The profile (sum of intensity over the spectrum at each elution time) is presented in Figure 6.3.
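The profile described above is simply the row sum of the data matrix. A minimal sketch, assuming the data are held in a NumPy array with elution times as rows and wavelengths as columns; the function name `elution_profile` and the small 4 × 3 matrix are illustrative assumptions, not values from Table 6.1:

```python
import numpy as np

def elution_profile(X):
    """Sum the intensity over all wavelengths (columns) at each elution time (row)."""
    X = np.asarray(X, dtype=float)
    return X.sum(axis=1)

# Hypothetical 4 x 3 matrix standing in for a 25 x 12 HPLC-DAD dataset.
X_demo = np.array([
    [0.0, 0.1, 0.0],
    [0.5, 1.0, 0.3],
    [0.9, 1.6, 0.7],
    [0.2, 0.4, 0.1],
])
profile = elution_profile(X_demo)   # one summed intensity per elution time
```

Plotting `profile` against the row index would give a graph of the same kind as Figure 6.3.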

 

Figure 6.3 Profile of data in Table 6.1 (intensity versus datapoint)

EVOLUTIONARY SIGNALS

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 6.1 Dataset A.

        A       B       C       D       E       F       G       H       I       J       K       L
 1   0.1102  0.0694  0.0886  0.0622  0.0079  0.0336  0.0518  0.0459  0.0320  0.0645  0.0174  0.0558
 2   0.0487  0.0001  0.0507  0.0014  0.0720  0.0377  0.0123  0.1377  0.0034  0.0015  0.0355  0.0608
 3   0.0360  0.0277  0.1005  0.0009  0.0386  0.0528  0.0612  0.0259  0.0293  0.0246  0.0283  0.0480
 4   0.2104  0.1564  0.1828  0.1073  0.0185  0.1912  0.0499  0.0587  0.0669  0.1275  0.1371  0.1521
 5   0.1713  0.3206  0.4304  0.3531  0.2383  0.1575  0.0015  0.1367  0.1024  0.1143  0.2227  0.2164
 6   0.4970  0.6192  0.7367  0.7042  0.3234  0.2930  0.1919  0.1325  0.3410  0.4269  0.4225  0.2212
 7   0.6753  1.1198  1.3239  1.0167  0.6054  0.3783  0.3703  0.3343  0.5290  0.5496  0.5986  0.5579
 8   1.0412  1.5129  1.6344  1.3823  1.0843  0.6825  0.3584  0.3340  0.5212  0.8334  0.9435  0.8741
 9   1.0946  1.5543  1.9253  1.5951  1.1767  0.7215  0.5764  0.5695  0.7138  0.8645  0.9545  0.9038
10   0.9955  1.4794  1.5299  1.5679  1.1986  0.7930  0.7043  0.5333  0.7661  0.9224  0.9744  0.8434
11   0.6720  1.1315  1.2793  1.2540  1.0619  0.9552  0.9070  0.7855  0.7800  0.7912  0.8432  0.6739
12   0.4690  0.7531  0.8139  1.0496  1.0940  1.1321  1.1164  1.0237  0.7796  0.6313  0.6549  0.5869
13   0.3113  0.3894  0.5844  0.7349  0.9656  1.2339  1.3362  1.2283  0.9590  0.5641  0.4393  0.4386
14   0.0891  0.2121  0.3344  0.5837  0.9758  1.3175  1.3713  1.2238  0.9459  0.6646  0.4327  0.3938
15   0.0567  0.1408  0.1690  0.4609  0.7807  1.1592  1.3094  1.1237  0.7724  0.4457  0.3217  0.3639
16   0.0391  0.0211  0.1684  0.3332  0.5427  0.8509  0.9616  0.7876  0.5951  0.3343  0.2212  0.2178
17   0.0895  0.0086  0.0790  0.1721  0.2747  0.4634  0.5820  0.5677  0.3231  0.1546  0.0379  0.1021
18   0.0070  0.0240  0.0842  0.1622  0.1922  0.2974  0.3571  0.2925  0.1289  0.0491  0.0518  0.0979
19   0.0146  0.0567  0.0672  0.0239  0.0113  0.2454  0.1721  0.1047  0.1577  0.0129  0.0458  0.0307
20   0.0012  0.0043  0.0362  0.0564  0.0693  0.0468  0.0213  0.1182  0.0152  0.0342  0.0140  0.0308
21   0.0937  0.0324  0.0371  0.0405  0.0648  0.0053  0.0218  0.0975  0.0222  0.0138  0.0065  0.0017
22   0.0031  0.0127  0.0323  0.0533  0.0670  0.0716  0.0479  0.0383  0.0038  0.0186  0.0026  0.0653
23   0.0387  0.0041  0.0175  0.0052  0.0199  0.0507  0.0263  0.0342  0.0072  0.0242  0.0579  0.0072
24   0.0449  0.0076  0.0191  0.0046  0.0572  0.0946  0.0018  0.0182  0.0368  0.0236  0.0619  0.0853
25   0.0986  0.0244  0.0185  0.0395  0.0291  0.0236  0.0137  0.0263  0.0156  0.0030  0.0237  0.0270

Rows are elution times (datapoints 1–25); columns are wavelengths A–L. Some baseline entries carry minus signs in the source whose positions could not be recovered; magnitudes are shown.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 6.4 Scores and loadings plots of PC2 versus PC1 of the raw data in Table 6.1 (scores labelled by elution time, loadings labelled by wavelength A–L)

The simplest plots are the scores and loadings plots of the first two PCs of the raw data (see Figure 6.4). These would suggest that there are two components, with a region of overlap between times 9 and 14, with wavelengths H and G most strongly associated with the slowest eluting compound and wavelengths A, B, C, L and K with the fastest eluting compound. For further discussion of the interpretation of these types of graph, see Section 4.3.5.
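Scores and loadings of the kind plotted in Figure 6.4 can be obtained from a singular value decomposition of the uncentred data matrix. The sketch below is one way of doing this, not necessarily the calculation used for the figures; the function names and the simulated two-peak data (Gaussian elution profiles and spectra with arbitrary positions and widths) are assumptions made for illustration and do not reproduce Table 6.1.

```python
import numpy as np

def pca_scores_loadings(X, n_components):
    """Uncentred PCA via SVD: X is approximated by T @ P (scores times loadings)."""
    U, s, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores: one row per elution time
    P = Vt[:n_components, :]                     # loadings: one column per wavelength
    return T, P

def simulate_two_peaks(noise=0.02, seed=0):
    """Hypothetical 25 x 12 dataset: two overlapping peaks plus random noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, 26)[:, None]                # elution times 1..25
    w = np.arange(12)[None, :]                   # wavelength indices 0..11 (A..L)
    C = np.hstack([np.exp(-0.5 * ((t - 9) / 2.5) ** 2),
                   np.exp(-0.5 * ((t - 13) / 2.5) ** 2)])   # elution profiles, 25 x 2
    S = np.vstack([np.exp(-0.5 * ((w - 2) / 2.0) ** 2),
                   np.exp(-0.5 * ((w - 7) / 2.0) ** 2)])    # spectra, 2 x 12
    return C @ S + noise * rng.standard_normal((25, 12))

X = simulate_two_peaks()
T, P = pca_scores_loadings(X, n_components=2)
# A scores plot is T[:, 1] against T[:, 0]; a loadings plot is P[1] against P[0].
```

Because the decomposition is applied to the raw (uncentred) data, the first PC mainly reflects overall intensity, which is why the scores of baseline points lie close to the origin.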

The dataset in Table 6.2 is of the same size but represents three partially overlapping peaks. The profile (Figure 6.5) appears to be slightly more complex than that for dataset A, and the PC scores plot presented in Figure 6.6 definitely appears to contain more features. Each turning point represents a pure compound, so it appears that there are three compounds, centred at times 9, 13 and 17. In addition, the spectral characteristics of the compounds centred at times 9 and 17 are probably more similar to each other than to those of the compound centred at time 13. The loadings plot suggests that wavelengths F, G and H are strongly associated with the middle eluting compound, whereas A, B, J, K and L are associated with the other two compounds. There is some distinction, in that wavelengths A, K and L appear most associated with the slowest eluting compound (centred at time 17) and B and J with the fastest. The loadings and scores could be combined into a biplot (Chapter 4, Section 4.3.7.1).

Table 6.2 Dataset B.

        A       B       C       D       E       F       G       H       I       J       K       L
 1   0.1214  0.0097  0.0059  0.0136  0.0399  0.0404  0.0530  0.0066  0.0078  0.0257  0.0641  0.0784
 2   0.0750  0.0200  0.0183  0.0251  0.1072  0.0218  0.0074  0.0876  0.0341  0.0295  0.0715  0.0746
 3   0.0256  0.1103  0.0246  0.0229  0.0347  0.0102  0.0365  0.1114  0.0079  0.0271  0.0654  0.0437
 4   0.0838  0.0486  0.0155  0.0142  0.0045  0.0213  0.0396  0.0499  0.0421  0.0396  0.1128  0.0614
 5   0.1956  0.2059  0.1601  0.1567  0.1594  0.0886  0.0310  0.0657  0.0898  0.0472  0.1688  0.1335
 6   0.4605  0.5753  0.5696  0.3477  0.1596  0.0725  0.1147  0.0382  0.1049  0.3195  0.3498  0.2801
 7   0.9441  1.1101  0.9926  0.6515  0.3413  0.2480  0.1340  0.1802  0.3163  0.4346  0.6187  0.7223
 8   1.3161  1.6053  1.3641  0.9179  0.5933  0.3124  0.2991  0.3650  0.6141  0.8600  0.8351  0.8289
 9   1.5698  1.8485  1.5372  1.1441  0.6591  0.3880  0.4163  0.4519  0.7866  1.0155  1.0880  0.8584
10   1.3576  1.6975  1.5099  1.1872  0.7743  0.6088  0.6598  0.6486  0.7726  1.0264  0.9573  0.8324
11   1.0215  1.1341  1.0579  0.9695  0.8587  0.9031  0.9594  0.9789  0.9984  0.9810  0.7305  0.8146
12   0.5267  0.6154  0.7074  0.8352  0.9716  1.2316  1.3408  1.1761  0.9874  0.7335  0.6364  0.5917
13   0.3936  0.3650  0.5143  0.6098  1.0398  1.2065  1.3799  1.3346  0.9871  0.7738  0.7354  0.6865
14   0.4351  0.3077  0.3630  0.5386  0.8882  1.1305  1.2870  1.2373  0.9289  0.8656  0.8967  0.9762
15   0.7120  0.4754  0.2429  0.3367  0.5563  0.7994  0.9244  0.8403  0.9415  0.9667  1.2267  1.2778
16   1.0076  0.5493  0.3146  0.2574  0.2568  0.4061  0.5775  0.6494  0.8331  1.1937  1.5703  1.7106
17   1.2155  0.5669  0.3203  0.1255  0.1279  0.1431  0.2725  0.3421  0.6828  1.1376  1.6209  1.8048
18   1.1392  0.4750  0.1603  0.0936  0.1297  0.0917  0.1154  0.2421  0.5582  0.9945  1.4607  1.6280
19   0.6988  0.4000  0.1668  0.0071  0.0290  0.0012  0.0062  0.1631  0.3016  0.6982  1.0083  1.1998
20   0.3291  0.1766  0.0825  0.0714  0.0180  0.0078  0.0121  0.0176  0.2330  0.2982  0.4253  0.6602
21   0.2183  0.1892  0.0436  0.0689  0.0239  0.0410  0.0367  0.0322  0.1284  0.2204  0.2760  0.1539
22   0.1135  0.0517  0.0418  0.0307  0.0017  0.0023  0.1068  0.0517  0.0411  0.0370  0.0782  0.0798
23   0.0442  0.0156  0.0520  0.0867  0.0020  0.1177  0.0374  0.0282  0.0036  0.0217  0.0521  0.0518
24   0.0013  0.1103  0.0536  0.0875  0.0212  0.0066  0.0199  0.0258  0.0119  0.1066  0.0664  0.0489
25   0.0697  0.0827  0.0093  0.0298  0.0511  0.0637  0.1094  0.0358  0.0279  0.0319  0.0480  0.0383

Rows are elution times (datapoints 1–25); columns are wavelengths A–L. Some baseline entries carry minus signs in the source whose positions could not be recovered; magnitudes are shown.

Figure 6.5 Profile of data in Table 6.2 (intensity versus datapoint)

It is sometimes clearer to present these graphs in three dimensions, adding a third PC, as in Figures 6.7 and 6.8. Note that the three-dimensional scores plot for dataset A is not particularly informative, and the two-dimensional plot shows the main trends more clearly. The reason is that there are only two main components in the system, so the third dimension consists primarily of noise and thus degrades the information. If the three dimensions were scaled according to the size of the PCs (or the eigenvalues), the graphs in Figure 6.7 would be flat. For dataset B, however, the directions are much clearer than in the two-dimensional projections, so adding an extra PC can be beneficial if there are more than two significant components.
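The remark about the size of the PCs can be checked numerically by comparing eigenvalues. The sketch below takes the eigenvalue of a PC to be the sum of squares of its scores (the squared singular value of the uncentred data); the function name and the simulated two-component data are assumptions for illustration and are not the values of Table 6.1.

```python
import numpy as np

def pc_eigenvalues(X):
    """Eigenvalue of each PC, taken here as the sum of squares of its scores."""
    s = np.linalg.svd(np.asarray(X, dtype=float), compute_uv=False)
    return s ** 2

# Simulated stand-in for a two-component dataset (an assumption, not Table 6.1).
rng = np.random.default_rng(0)
t = np.arange(1, 26)[:, None]                    # elution times 1..25
w = np.arange(12)[None, :]                       # wavelength indices 0..11
C = np.hstack([np.exp(-0.5 * ((t - 9) / 2.5) ** 2),
               np.exp(-0.5 * ((t - 13) / 2.5) ** 2)])   # 25 x 2 elution profiles
S = np.vstack([np.exp(-0.5 * ((w - 2) / 2.0) ** 2),
               np.exp(-0.5 * ((w - 7) / 2.0) ** 2)])    # 2 x 12 spectra
X = C @ S + 0.02 * rng.standard_normal((25, 12))

eig = pc_eigenvalues(X)
fraction = eig / eig.sum()   # share of the total sum of squares carried by each PC
```

With only two real components, the third eigenvalue is very small compared with the first two, so a third axis drawn on the same scale as the others would show almost no spread.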

A useful trick is to normalise the scores. This involves calculating

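$$ {}^{\mathrm{norm}}t_{ia} = \frac{t_{ia}}{\sqrt{\sum_{a=1}^{A} t_{ia}^{2}}} $$

where $t_{ia}$ is the score of elution time $i$ on PC $a$, and $A$ is the number of PCs over which the scores are normalised.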

Note that the terminology in the literature is often confusing and conflicting: some authors call this summing to a constant total normalisation, but we will adopt only one convention in this book. If you read the original literature, be very careful about terminology. If only two PCs are used this will project the scores on to a circle, whereas if three PCs are used the projection will be on to a sphere. It is best to set A according to the number of compounds in the region of the chromatogram being studied.

Figure 6.6 Scores and loadings plots of PC2 versus PC1 of the raw data in Table 6.2 (scores labelled by elution time, loadings labelled by wavelength A–L)
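The normalisation of the scores amounts to dividing each row of the scores matrix by its Euclidean length over the first A PCs. A minimal sketch, assuming the scores are held in a NumPy array with one row per elution time and one column per retained PC; the function name and the small demonstration matrix are illustrative assumptions:

```python
import numpy as np

def normalise_scores(T):
    """Divide each row of the scores matrix by the square root of its sum of squares.

    T has one row per elution time and A columns (the PCs used). Rows that are
    exactly zero are left as zeros to avoid division by zero.
    """
    T = np.asarray(T, dtype=float)
    norms = np.sqrt((T ** 2).sum(axis=1, keepdims=True))
    return np.divide(T, norms, out=np.zeros_like(T), where=norms > 0)

# Hypothetical scores on two PCs for three elution times.
T_demo = np.array([[3.0, 4.0],
                   [0.1, 0.0],
                   [-1.0, 1.0]])
T_norm = normalise_scores(T_demo)
```

Every non-zero row of `T_norm` has unit length, so with two PCs the points fall on a circle of radius 1 and with three PCs on a unit sphere. A weak, noisy point such as the second row is stretched out to the circle just like a strong one, which is why noise-dominated elution times can land anywhere on it.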

Figure 6.9 illustrates the scores of dataset A normalised over two PCs. Between times 3 and 21, the points in the chromatogram are in sequence on the arc of a circle. The extremes (3 and 21) could represent the purest elution times, but points influenced primarily by noise might lie anywhere on the circle. Hence time 25, which is clearly not representative of the fastest eluting component, is close to time 3 (this is entirely fortuitous and depends on the noise distribution). Because elution times 4–9 are closely clustered, they probably better represent the faster eluting compound. Note how points on a straight line (Figure 6.4) in the raw scores plot project on to clusters in the normalised scores plot.

Figure 6.7 Three-dimensional projections of scores (top) and loadings (bottom) for dataset A

Figure 6.8 Three-dimensional projections of scores (top) and loadings (bottom) for dataset B

The normalised scores of dataset B [Figure 6.10(a)] show a clearer pattern. The figure suggests the following:

• points 1–4 are mainly noise as they form a fairly random pattern;
• the purest points for the fastest eluting peak are 6 and 7, because these correspond to a turning point;
• the purest points for the middle eluting peak are 12 and 13, again a turning point;
• the purest points for the slowest eluting peak are 18–20;
• points 23–25 are mainly dominated by noise.

Figure 6.9 Scores of dataset A normalised over the first two principal components

It is probably best to remove the noise points 1–4 and 23–25, and show the normalised scores plot as in Figure 6.10(b). Notice that we come to a slightly different conclusion from Figure 6.6 as to which are the most representative elution times (or spectra) for each component. This is mainly because the ends of each limb in the raw scores plot correspond to the peak maxima, which are not necessarily the purest regions. For the fastest and slowest eluting components the purest regions will be at more extreme elution times before noise dominates: if the noise levels are low they may be at the base rather than the top of the peak clusters. For the central peak the purest region is still at the same position, probably because this peak does not have a selective or pure region. The data could also be normalised over three dimensions, with pure points falling on the surface of a sphere; the clustering becomes more obvious (see Figure 6.11). Note that similar calculations can be performed on the loadings plots and it is possible to normalise the loadings instead.
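Dropping the noise-dominated points and normalising over three PCs can be sketched as follows. This is an illustration under stated assumptions rather than the calculation behind Figures 6.10(b) and 6.11: the three-peak data are simulated (peak positions 9, 13 and 17 follow the text, but the spectra, peak widths and noise level are arbitrary), PCA is performed on the whole uncentred matrix, and only elution times 5–22 are retained afterwards.

```python
import numpy as np

# Simulated three-component stand-in for dataset B (not the values of Table 6.2).
rng = np.random.default_rng(1)
t = np.arange(1, 26)[:, None]                    # elution times 1..25
w = np.arange(12)[None, :]                       # wavelength indices 0..11
C = np.hstack([np.exp(-0.5 * ((t - c) / 2.0) ** 2) for c in (9, 13, 17)])   # 25 x 3
S = np.vstack([np.exp(-0.5 * ((w - m) / 2.0) ** 2) for m in (1, 6, 10)])    # 3 x 12
X = C @ S + 0.02 * rng.standard_normal((25, 12))

# Scores on the first three PCs of the uncentred data.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
T = U[:, :3] * s[:3]

# Keep elution times 5-22, i.e. drop the noise points 1-4 and 23-25.
times = np.arange(1, 26)
keep = (times >= 5) & (times <= 22)
T_kept = T[keep]

# Normalise each retained row to unit length: points now lie on a unit sphere.
T_norm = T_kept / np.linalg.norm(T_kept, axis=1, keepdims=True)
```

Plotting the three columns of `T_norm` against one another, labelled with `times[keep]`, would give a picture of the same kind as the normalised scores plots discussed above.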

6.2.3 Scaling the Data

It is also possible to scale the raw data prior to performing PCA.

6.2.3.1 Scaling the Rows

Each successive row in a data matrix formed from a coupled chromatogram corresponds to a spectrum taken at a given elution time. One of the simplest methods of scaling

Figure 6.10 Scores of dataset B normalised over the first two principal components: (a) over entire region; (b) expansion of central regions
