Chemometrics: Data Analysis for the Laboratory and Chemical Plant.

Richard G. Brereton

Copyright 2003 John Wiley & Sons, Ltd.

ISBNs: 0-471-48977-8 (HB); 0-471-48978-6 (PB)

Chemometrics

Data Analysis for the Laboratory and Chemical Plant

Richard G. Brereton

University of Bristol, UK

Copyright 2003 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England

Telephone (+44) 1243 779777

Email (for orders and customer service enquiries): cs-books@wiley.co.uk

Visit our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to (+44) 1243 770571.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Other Wiley Editorial Offices

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA

Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA

Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany

John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809

John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data

Brereton, Richard G.

Chemometrics : data analysis for the laboratory and chemical plant / Richard Brereton. p. cm.

Includes bibliographical references and index.

ISBN 0-471-48977-8 (hardback : alk. paper) – ISBN 0-470-84911-8 (pbk. : alk. paper)

1. Chemistry, Analytic–Statistical methods–Data processing. 2. Chemical processes–Statistical methods–Data processing. I. Title.

QD75.4.S8 B74 2002

543 .007 27–dc21

2002027212

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

ISBN 0-471-48977-8 (Hardback)

ISBN 0-471-48978-6 (Paperback)

Typeset in 10/12pt Times by Laserwords Private Limited, Chennai, India

Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire

This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.

Contents

Preface
Supplementary Information
Acknowledgements

1 Introduction
  1.1 Points of View
  1.2 Software and Calculations
  1.3 Further Reading
    1.3.1 General
    1.3.2 Specific Areas
    1.3.3 Internet Resources
  1.4 References

2 Experimental Design
  2.1 Introduction
  2.2 Basic Principles
    2.2.1 Degrees of Freedom
    2.2.2 Analysis of Variance and Comparison of Errors
    2.2.3 Design Matrices and Modelling
    2.2.4 Assessment of Significance
    2.2.5 Leverage and Confidence in Models
  2.3 Factorial Designs
    2.3.1 Full Factorial Designs
    2.3.2 Fractional Factorial Designs
    2.3.3 Plackett–Burman and Taguchi Designs
    2.3.4 Partial Factorials at Several Levels: Calibration Designs
  2.4 Central Composite or Response Surface Designs
    2.4.1 Setting Up the Design
    2.4.2 Degrees of Freedom
    2.4.3 Axial Points
    2.4.4 Modelling
    2.4.5 Statistical Factors
  2.5 Mixture Designs
    2.5.1 Mixture Space
    2.5.2 Simplex Centroid
    2.5.3 Simplex Lattice
    2.5.4 Constraints
    2.5.5 Process Variables
  2.6 Simplex Optimisation
    2.6.1 Fixed Sized Simplex
    2.6.2 Elaborations
    2.6.3 Modified Simplex
    2.6.4 Limitations
  Problems

3 Signal Processing
  3.1 Sequential Signals in Chemistry
    3.1.1 Environmental and Geological Processes
    3.1.2 Industrial Process Control
    3.1.3 Chromatograms and Spectra
    3.1.4 Fourier Transforms
    3.1.5 Advanced Methods
  3.2 Basics
    3.2.1 Peakshapes
    3.2.2 Digitisation
    3.2.3 Noise
    3.2.4 Sequential Processes
  3.3 Linear Filters
    3.3.1 Smoothing Functions
    3.3.2 Derivatives
    3.3.3 Convolution
  3.4 Correlograms and Time Series Analysis
    3.4.1 Auto-correlograms
    3.4.2 Cross-correlograms
    3.4.3 Multivariate Correlograms
  3.5 Fourier Transform Techniques
    3.5.1 Fourier Transforms
    3.5.2 Fourier Filters
    3.5.3 Convolution Theorem
  3.6 Topical Methods
    3.6.1 Kalman Filters
    3.6.2 Wavelet Transforms
    3.6.3 Maximum Entropy (Maxent) and Bayesian Methods
  Problems

4 Pattern Recognition
  4.1 Introduction
    4.1.1 Exploratory Data Analysis
    4.1.2 Unsupervised Pattern Recognition
    4.1.3 Supervised Pattern Recognition
  4.2 The Concept and Need for Principal Components Analysis
    4.2.1 History
    4.2.2 Case Studies
    4.2.3 Multivariate Data Matrices
    4.2.4 Aims of PCA
  4.3 Principal Components Analysis: the Method
    4.3.1 Chemical Factors
    4.3.2 Scores and Loadings
    4.3.3 Rank and Eigenvalues
    4.3.4 Factor Analysis
    4.3.5 Graphical Representation of Scores and Loadings
    4.3.6 Preprocessing
    4.3.7 Comparing Multivariate Patterns
  4.4 Unsupervised Pattern Recognition: Cluster Analysis
    4.4.1 Similarity
    4.4.2 Linkage
    4.4.3 Next Steps
    4.4.4 Dendrograms
  4.5 Supervised Pattern Recognition
    4.5.1 General Principles
    4.5.2 Discriminant Analysis
    4.5.3 SIMCA
    4.5.4 Discriminant PLS
    4.5.5 K Nearest Neighbours
  4.6 Multiway Pattern Recognition
    4.6.1 Tucker3 Models
    4.6.2 PARAFAC
    4.6.3 Unfolding
  Problems

5 Calibration
  5.1 Introduction
    5.1.1 History and Usage
    5.1.2 Case Study
    5.1.3 Terminology
  5.2 Univariate Calibration
    5.2.1 Classical Calibration
    5.2.2 Inverse Calibration
    5.2.3 Intercept and Centring
  5.3 Multiple Linear Regression
    5.3.1 Multidetector Advantage
    5.3.2 Multiwavelength Equations
    5.3.3 Multivariate Approaches
  5.4 Principal Components Regression
    5.4.1 Regression
    5.4.2 Quality of Prediction
  5.5 Partial Least Squares
    5.5.1 PLS1
    5.5.2 PLS2
    5.5.3 Multiway PLS
  5.6 Model Validation
    5.6.1 Autoprediction
    5.6.2 Cross-validation
    5.6.3 Independent Test Sets
  Problems

6 Evolutionary Signals
  6.1 Introduction
  6.2 Exploratory Data Analysis and Preprocessing
    6.2.1 Baseline Correction
    6.2.2 Principal Component Based Plots
    6.2.3 Scaling the Data
    6.2.4 Variable Selection
  6.3 Determining Composition
    6.3.1 Composition
    6.3.2 Univariate Methods
    6.3.3 Correlation and Similarity Based Methods
    6.3.4 Eigenvalue Based Methods
    6.3.5 Derivatives
  6.4 Resolution
    6.4.1 Selectivity for All Components
    6.4.2 Partial Selectivity
    6.4.3 Incorporating Constraints
  Problems

Appendices
  A.1 Vectors and Matrices
  A.2 Algorithms
  A.3 Basic Statistical Concepts
  A.4 Excel for Chemometrics
  A.5 Matlab for Chemometrics

Index

Preface

This text is the product of several years of activity on my part. First and foremost, the task of educating students in my research group from a wide variety of backgrounds over the past 10 years has been a significant formative experience, and this has allowed me to develop a large series of problems which we set every 3 weeks and for which we present answers in seminars. From my experience, this is the best way to learn chemometrics! In addition, I have had the privilege to organise international quality courses mainly for industrialists with the participation as tutors of many representatives of the best organisations and institutes around the world, and I have learnt from them. Different approaches are normally taken when teaching industrialists who may be encountering chemometrics for the first time in mid-career and have a limited period of a few days to attend a condensed course, and university students who have several months or even years to practise and improve. However, it is hoped that this book represents a symbiosis of both needs.

In addition, it has been a great inspiration for me to write a regular fortnightly column for Chemweb (available to all registered users on www.chemweb.com) and some of the material in this book is based on articles first available in this format. Chemweb brings a large reader base to chemometrics, and feedback via e-mails or even travels around the world has helped me formulate my ideas. There is a very wide interest in this subject but it is somewhat fragmented. For example, there is a strong group of near-infrared spectroscopists, primarily in the USA, that has led to the application of advanced ideas in process monitoring, who see chemometrics as a quite technical industrially oriented subject. There are other groups of mainstream chemists who see chemometrics as applicable to almost all branches of research, ranging from kinetics to titrations to synthesis optimisation. Satisfying all these diverse people is not an easy task.

This book relies heavily on numerical examples: many in the body of the text come from my favourite research interests, which are primarily in analytical chromatography and spectroscopy; to have expanded the text more would have produced a huge book of twice the size, so I ask the indulgence of readers whose area of application may differ. Certain chapters, such as that on calibration, could be approached from widely different viewpoints, but the methodological principles are the most important and if you understand how the ideas can be applied in one area you will be able to translate to your own favourite application. In the problems at the end of each chapter I cover a wider range of applications to illustrate the broad basis of these methods. The emphasis of this book is on understanding ideas, which can then be applied to a wide variety of problems in chemistry, chemical engineering and allied disciplines.

It was difficult to select what material to include in this book without making it too long. Every expert to whom I have shown this book has made suggestions for new material. Some I have taken into account, and I am most grateful for every proposal; others I have mentioned briefly or not at all, mainly for reasons of length and also to ensure that this text sees the light of day rather than constantly expanding without end.

There are many outstanding specialist books for the enthusiast. It is my experience, though, that if you understand the main principles (which are quite few in number), and constantly apply them to a variety of problems, you will soon pick up the more advanced techniques, so it is the building blocks that are most important.

In a book of this nature it is very difficult to decide on what detail is required for the various algorithms: some readers will have no real interest in the algorithms, whereas others will feel the text is incomplete without comprehensive descriptions. The main algorithms for common chemometric methods are presented in Appendix A.2. Step-by-step descriptions of methods, rather than algorithms, are presented in the text. A few approaches that will interest some readers, such as cross-validation in PLS, are described in the problems at the end of appropriate chapters which supplement the text. It is expected that readers will approach this book with different levels of knowledge and expectations, so it is possible to gain a great deal without having an in-depth appreciation of computational algorithms, but for interested readers the information is nevertheless available. People rarely read texts in a linear fashion; they often dip in and out of parts of them according to their background and aspirations, and chemometrics is a subject which people approach with very different types of previous knowledge and skills, so it is possible to gain from this book without covering every topic in full. Many readers will simply use Add-ins or Matlab commands and be able to produce all the results in this text.

Chemometrics uses a very large variety of software. In this book we recommend two main environments, Excel and Matlab; the examples have been tried using both environments, and you should be able to get the same answers in both cases. Users of this book will vary from people who simply want to plug the data into existing packages to those who are curious and want to reproduce the methods in their own favourite language such as Matlab, VBA or even C. In some cases instructors may use the information available with this book to tailor examples for problem classes. Extra software supplements are available via the publisher’s www.SpectroscopyNOW.com website, together with all the datasets and solutions associated with this book.

The problems at the end of each chapter form an important part of the text, the examples being a mixture of simulations (which have an important role in chemometrics) and real case studies from a wide variety of sources. For each problem the relevant sections of the text that provide further information are referenced. However, a few problems build on the existing material and take the reader further: a good chemometrician should be able to use the basic building blocks to understand and use new methods. The problems are of various types, so not every reader will want to solve all the problems. Also, instructors can use the datasets to construct workshops or course material that go further than the book.

I am very grateful for the tremendous support I have had from many people when asking for information and help with datasets, and permission where required. Chemweb is thanked for agreement to present material modified from articles originally published in their e-zine, The Alchemist, and the Royal Society of Chemistry for permission to base the text of Chapter 5 on material originally published in The Analyst [125, 2125–2154 (2000)]. A full list of acknowledgements for the datasets used in this text is presented after this preface.

Tom Thurston and Les Erskine are thanked for a superb job on the Excel add-in, and Hailin Shen for outstanding help with Matlab. Numerous people have tested out the answers to the problems. Special mention should be given to Christian Airiau, Kostas Zissis, Tom Thurston, Conrad Bessant and Cevdet Demir for access to a comprehensive set of answers on disc for a large number of exercises so I can check mine. In addition, several people have read chapters and made detailed comments, particularly checking numerical examples. In particular, I thank Hailin Shen for suggestions about improving Chapter 6 and Mohammed Wasim for careful checking of errors. In some ways the best critics are the students and postdocs working with me, because they are the people that have to read and understand a book of this nature, and it gives me great confidence that my co-workers in Bristol have found this approach useful and have been able to learn from the examples.

Finally I thank the publishers for taking a germ of an idea and making valuable suggestions as to how this could be expanded and improved to produce what I hope is a successful textbook, and having faith and patience over a protracted period.

Bristol, February 2002

Richard Brereton
