2 MAIN SECTIONS
2.1 Problem statement
Suppose that we have a finite population of N individuals, each with a fixed probability of having a certain (rare) condition and we wish to identify all the affected individuals. We assume that samples of blood are available from all the individuals concerned and that a test is available which will detect the presence of the condition. If the testing procedure is expensive or difficult to carry out, we shall wish to minimise the total number of tests required to find all the affected individuals.
Suppose that it is possible to apply the test to the pooled blood sample from a number of individuals such that a positive result will be indicated if at least one of the individuals is affected. We can then save on the number of tests since a negative result will allow all individuals in the pool to be cleared.
There are a number of different testing procedures which could be used in these circumstances and the problem posed here is to find a procedure which will minimise the expectation of the total number of tests.
2.2 Assumptions
For the remainder of this report we shall assume the following.
1. The test in question can be applied to a pooled sample of blood made up from the samples taken from the individuals.
2. The test is 100% reliable in the sense that a negative response means that all individuals who contributed to the pooled sample must be clear and that a positive response will only be obtained when at least one individual in the pool is affected.
3. The amount of blood sampled from each individual is sufficient to be divided into a number of parts for subsequent testing.
4. Every individual involved has the same inherent probability $P$ of being affected.
2.3 Individual testing
The most direct procedure is of course to apply the test to every individual. We show in the next section that this is in fact the best procedure if P exceeds about 0.3 but for smaller values of P the use of pooled samples gives a saving in the expected number of tests.
2.4 Single-stage procedure
In this procedure, we divide the population into a number of groups ($L$, say) and test a pooled sample from each group. Each group contains $K = N/L$ individuals and each test is a Bernoulli trial with probability $(1 - P)^K$ of a negative response. If $X$ is the number of groups giving a positive response, the distribution of $X$ will be binomial $(L, P')$, where $P' = 1 - (1 - P)^K$.
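For each group with a positive response, all the individuals in that group will be tested. This requires $XK$ tests, in addition to the $L$ pooled tests. The expected number of tests is therefore
$$E(T) = L + K\,E(X) = L + N\left[1 - (1 - P)^{N/L}\right]$$
or, in terms of $K$, the size of each group,
$$\frac{E(T)}{N} = \frac{1}{K} + 1 - (1 - P)^K. \tag{1}$$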
$K$ has to be an integer of course but, if we regard the expression on the right-hand side as a function of a continuous variable $K$, we obtain curves such as those shown in Figure 1. For values of $P < 1 - \exp[-4\exp(-2)]$ ($\approx 0.418$), the curves have a local minimum for $K > 0$ so that the expected number of tests can be minimised by an appropriate choice of $K$.
Figure 1
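Equation (1) shows that the optimal choice of $K$ depends on $P$ only. We can get an approximation to this optimal $K$ value for small $P$ by using the relation $1 - (1 - P)^K \approx KP$; then
$$\frac{E(T)}{N} \approx \frac{1}{K} + KP.$$
Differentiation with respect to $K$ gives
$$-\frac{1}{K^2} + P = 0, \quad\text{so that}\quad K \approx \frac{1}{\sqrt{P}}.$$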
From the graphs in Figure 1 we see that, as $K \to \infty$, we have $1/K + 1 - (1 - P)^K \to 1$ so that $E(T)/N$ can be brought close to 1 by choosing a sufficiently large value of $K$. In practice of course there is an upper limit on $K$ given by the value of $N$. For values of $P \le 1 - \exp[-\exp(-1)]$ ($\approx 0.308$), the local minimum for $E(T)/N$ does give the overall minimum. For values of $P$ larger than this, however, the overall minimum for $E(T)/N$ is 1.0 and the best choice is to make $K$ as large as possible, in other words to test each individual in the population.
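The two threshold values of $P$ quoted above (about 0.418 and about 0.308) can be checked with a short calculation. The following is a sketch only; the symbols $a = -\ln(1 - P)$ and $f(K)$, standing for the expression $1/K + 1 - (1 - P)^K$, are introduced here for convenience and do not appear in the report itself.

```latex
\begin{align*}
f(K) &= \frac{1}{K} + 1 - e^{-aK}, \qquad a = -\ln(1 - P) > 0,\\
f'(K) &= -\frac{1}{K^2} + a e^{-aK} = 0
  \iff (aK)^2 e^{-aK} = a.
\end{align*}
% u^2 e^{-u} has its maximum 4e^{-2} at u = 2, so a local minimum exists only if
\begin{align*}
a < 4e^{-2} \iff P < 1 - \exp[-4\exp(-2)] \approx 0.418.
\end{align*}
% The local minimum equals the limiting value 1 when f(K) = 1 and f'(K) = 0 together:
\begin{align*}
\frac{1}{K} = e^{-aK}, \quad \frac{1}{K^2} = a e^{-aK}
  \implies K = \frac{1}{a},\; a = e^{-1}
  \implies P = 1 - \exp[-\exp(-1)] \approx 0.308.
\end{align*}
```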
The distribution of $T/N$ can be found from the fact that $T/N = 1/K + KX/N$ where $X$ is binomial $(N/K,\ 1 - (1 - P)^K)$.
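As an illustration of the direct search over integer group sizes, the expression $1/K + 1 - (1 - P)^K$ for the expected number of tests per individual can be evaluated numerically. The sketch below is not the report's own program; the function names and the search cap `k_max` are assumptions made for the example.

```python
import math


def expected_tests_per_head(k: int, p: float) -> float:
    """Expected tests per individual for group size k: 1/K + 1 - (1 - P)^K."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_group_size(p: float, k_max: int = 200) -> int:
    """Direct search over integer group sizes 1..k_max (k_max is an assumed cap)."""
    return min(range(1, k_max + 1), key=lambda k: expected_tests_per_head(k, p))


if __name__ == "__main__":
    # Columns: P, best integer K, E(T)/N at that K, small-P approximation 1/sqrt(P)
    for p in (0.01, 0.02, 0.05, 0.1):
        k = best_group_size(p)
        print(p, k, round(expected_tests_per_head(k, p), 3), round(1 / math.sqrt(p), 1))
```

For small $P$ the integer found by the search can be compared with the approximation $1/\sqrt{P}$ printed in the last column.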
2.5 Two-stage procedure
In this procedure the first stage is to divide the population into $L_1$ groups with $K_1 = N/L_1$ individuals in each group and to test the pooled blood samples from each. This requires $L_1$ tests to be carried out. Suppose that $X_1$ of these are positive. In the second stage, we divide each of the positive groups into $L_2$ subgroups with $K_2 = N/(L_1 L_2)$ individuals in each and test the pooled blood samples from each subgroup. This requires $X_1 L_2$ tests. Suppose that $X_2$ of these tests are positive. For each positive result, we test every individual in that subgroup. The total number of tests for the whole procedure is
$$T = L_1 + X_1 L_2 + X_2 K_2$$
where the random variable $X_1$ has the binomial distribution $(L_1, P')$ and the distribution of the random variable $X_2$, conditional on $X_1$, is binomial $(X_1 L_2, P'')$, where
$$P' = 1 - (1 - P)^{N/L_1} \quad\text{and}\quad P'' = 1 - (1 - P)^{N/(L_1 L_2)}.$$
We have $E(X_1) = L_1 P'$ and $E(X_2) = L_1 L_2 P' P''$, so that $E(T) = L_1 + L_1 L_2 P' + N P' P''$. As with the single-stage procedure, it is mathematically more convenient to express $E(T)$ in terms of $K_1$ and $K_2$, which gives us
$$\frac{E(T)}{N} = \frac{1}{K_1} + \frac{1 - (1 - P)^{K_1}}{K_2} + \left[1 - (1 - P)^{K_1}\right]\left[1 - (1 - P)^{K_2}\right]. \tag{2}$$
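Using the approximation $1 - (1 - P)^X \approx XP$ for small $P$, we get
$$\frac{E(T)}{N} \approx \frac{1}{K_1} + \frac{K_1 P}{K_2} + K_1 K_2 P^2.$$
On setting the partial derivatives with respect to $K_1$ and $K_2$ equal to zero, we find that for a stationary point of $E(T)$ regarded as a function of $K_1$ and $K_2$ we have
$$K_1 = \frac{1}{\sqrt{2}\,P^{3/4}}, \qquad K_2 = \frac{1}{\sqrt{P}}. \tag{3}$$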
In actual fact, of course, $K_1/K_2$ must be an integer. As with the single-stage procedure the optimal choices of $K_1$ and $K_2$ are independent of $N$, being dependent on $P$ only.
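The two-stage expectation can be explored numerically in the same way. The sketch below assumes that $P'' = 1 - (1 - P)^{K_2}$, the form that is consistent with the figures quoted in the Results section; note that this is the unconditional probability that a subgroup contains an affected individual, and the sketch follows that expression rather than conditioning on the parent group having tested positive. The function names and the search cap `k1_max` are hypothetical.

```python
import math


def expected_tests_two_stage(k1: int, k2: int, p: float) -> float:
    """E(T)/N = 1/K1 + P'/K2 + P'P'', assuming
    P' = 1 - (1 - P)^K1 and P'' = 1 - (1 - P)^K2."""
    p1 = 1.0 - (1.0 - p) ** k1
    p2 = 1.0 - (1.0 - p) ** k2
    return 1.0 / k1 + p1 / k2 + p1 * p2


def best_two_stage(p: float, k1_max: int = 100) -> tuple:
    """Direct search over integer pairs (K1, K2) with K1/K2 an integer."""
    candidates = [
        (k1, k2)
        for k1 in range(1, k1_max + 1)
        for k2 in range(1, k1 + 1)
        if k1 % k2 == 0
    ]
    return min(candidates, key=lambda ks: expected_tests_two_stage(ks[0], ks[1], p))


def approx_two_stage(p: float) -> tuple:
    """Small-P stationary point: K1 = 1/(sqrt(2) P^(3/4)), K2 = 1/sqrt(P)."""
    return 1.0 / (math.sqrt(2.0) * p ** 0.75), 1.0 / math.sqrt(p)


if __name__ == "__main__":
    p = 0.02
    k1, k2 = best_two_stage(p)
    print("direct search:", k1, k2, round(expected_tests_two_stage(k1, k2, p), 3))
    print("approximation:", approx_two_stage(p))
```

The search is restricted to pairs in which $K_2$ divides $K_1$, since each group must split into a whole number of subgroups.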
2.6 Results
In Table 1 the values of $K$, $K_1$ and $K_2$ giving minimum $E(T)$ are recorded for various values of $P$. These were found by direct search from the exact expressions (1) and (2). A marked reduction in the expected number of tests is revealed when the two-stage method is used, especially at low values of $P$. The approximations given in equations (3) are found to be reasonably accurate. For example, at $P = 0.02$, equations (3) give $K_1 = 13$ and $K_2 = 7$ while the actual optimum is at $K_1 = 16$, $K_2 = 8$.
To indicate the distribution of $T$, a simulation was carried out using a random-number generator. The following results were obtained from 300 runs using a population size $N = 96$ (for convenience) and $P = 0.02$. The distribution of $T$ using the two-stage procedure with $K_1 = 16$ and $K_2 = 8$ is as follows.
Table 1
This has a mean value of 12.76 and a variance of 57.42. The calculated value of $E(T)/N$ is 0.133, which agrees well with the theoretical prediction of 0.138 from Table 1.
For comparison, a simulation of 300 runs using the single-stage procedure with $K = 8$ (for $N = 96$, $P = 0.02$) gave the following distribution of $T$.
This has a mean value of 27.2 and a variance of 128.64. The calculated value of $E(T)/N$ is 0.283 compared with the theoretical value of 0.274.
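A simulation of the single-stage procedure along these lines might be sketched as follows. This is not the program used for the report: the function names, the seed and the use of Python's `random` module are assumptions, and a different generator or seed will give somewhat different sample means and variances.

```python
import random


def simulate_single_stage(n: int, k: int, p: float, rng: random.Random) -> int:
    """One run: return the total number of tests T using groups of size k."""
    tests = 0
    for start in range(0, n, k):
        # Each individual is affected independently with probability p.
        group = [rng.random() < p for _ in range(min(k, n - start))]
        tests += 1                 # one pooled test for the group
        if any(group):             # positive pool: test every member individually
            tests += len(group)
    return tests


def run_simulation(runs: int = 300, n: int = 96, k: int = 8,
                   p: float = 0.02, seed: int = 1) -> tuple:
    """Return the sample mean and sample variance of T over the given runs."""
    rng = random.Random(seed)      # the seed is an arbitrary choice
    totals = [simulate_single_stage(n, k, p, rng) for _ in range(runs)]
    mean = sum(totals) / runs
    variance = sum((t - mean) ** 2 for t in totals) / (runs - 1)
    return mean, variance


if __name__ == "__main__":
    mean, variance = run_simulation()
    print("mean T:", mean, "variance:", variance, "mean T/N:", mean / 96)
```

With only 300 runs the sample mean of $T/N$ will scatter around the theoretical value of 0.274; increasing `runs` narrows the scatter.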
2.7 Regular section procedures
In a bisection procedure an initial test of the pooled sample for the population, if proved positive, is