1 The Poisson approximation to the binomial distribution
The probability of the outcome $X = r$ of a set of Bernoulli trials can always be calculated by using the formula
$$P(X = r) = {}^{n}C_{r}\, q^{n-r} p^{r}$$
given above. Clearly, for very large values of $n$ the calculation can be rather tedious; this is particularly so when very small values of $p$ are also present. When $n$ is large, $p$ is small and the product $np$ is constant, we can take a different approach to the problem of calculating the probability that $X = r$. In the table below the values of $P(X = r)$ have been calculated for various combinations of $n$ and $p$ under the constraint that $np = 1$. You should try some of the calculations for yourself using the formula given above for some of the smaller values of $n$.
Probability of $X$ successes
$n$ | $p$ | $X=0$ | $X=1$ | $X=2$ | $X=3$ | $X=4$ | $X=5$ | $X=6$ |
4 | 0.25 | 0.316 | 0.422 | 0.211 | 0.047 | 0.004 | ||
5 | 0.20 | 0.328 | 0.410 | 0.205 | 0.051 | 0.006 | 0.000 | |
10 | 0.10 | 0.349 | 0.387 | 0.194 | 0.057 | 0.011 | 0.001 | 0.000 |
20 | 0.05 | 0.358 | 0.377 | 0.189 | 0.060 | 0.013 | 0.002 | 0.000 |
100 | 0.01 | 0.366 | 0.370 | 0.185 | 0.061 | 0.015 | 0.003 | 0.000 |
1000 | 0.001 | 0.368 | 0.368 | 0.184 | 0.061 | 0.015 | 0.003 | 0.001 |
10000 | 0.0001 | 0.368 | 0.368 | 0.184 | 0.061 | 0.015 | 0.003 | 0.001 |
Each of the binomial distributions given has mean $np = 1$. Notice that the probabilities that $X = 0, 1, 2, 3, 4, \dots$ approach the values $0.368, 0.368, 0.184, 0.061, 0.015, \dots$ as $n$ increases.
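The table entries can be recomputed directly from the binomial formula. The following is a minimal sketch, not part of the original text: the function names `binomial_pmf` and `poisson_pmf` are chosen here for illustration, and the final line prints the Poisson values with mean 1 that the rows approach. Recomputed values may differ from a printed table in the last digit because of rounding.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p), using nCr * q^(n-r) * p^r with q = 1 - p."""
    return comb(n, r) * (1 - p) ** (n - r) * p ** r

def poisson_pmf(r, lam):
    """P(X = r) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** r / factorial(r)

# Values of n from the table; p is chosen so that n * p = 1 in every row.
for n in [4, 5, 10, 20, 100, 1000, 10000]:
    p = 1 / n
    row = [binomial_pmf(r, n, p) for r in range(min(n, 6) + 1)]
    print(n, p, " | ".join(f"{v:.3f}" for v in row))

# Limiting values that the rows approach (Poisson with mean 1).
print("limit", " | ".join(f"{poisson_pmf(r, 1.0):.3f}" for r in range(7)))
```

Each row sums to 1 over all possible values of $X$, so only the first few columns carry appreciable probability once $n$ is large.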
If we have to determine the probabilities of success when large values of $n$ and small values of $p$ are involved it would be very convenient if we could do so without having to construct tables. In fact we can do such calculations by using the Poisson distribution which, under certain constraints, may be considered as an approximation to the binomial distribution.
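By considering simplifications applied to the binomial distribution subject to the conditions
1. $n$ is large
2. $p$ is small
3. $np = \lambda$ ($\lambda$ a constant)
we can derive the formula
$$P(X = r) = e^{-\lambda}\,\frac{\lambda^{r}}{r!}$$
as an approximation to
$$P(X = r) = {}^{n}C_{r}\, q^{n-r} p^{r}.$$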
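This is the Poisson distribution given previously. We now show how this is done. We know that the binomial distribution is given by
$$(q + p)^{n} = q^{n} + n q^{n-1} p + \frac{n(n-1)}{2!}\, q^{n-2} p^{2} + \dots + \frac{n(n-1)\cdots(n-r+1)}{r!}\, q^{n-r} p^{r} + \dots + p^{n}$$
Condition (2) tells us that since $p$ is small, $q = 1 - p$ is approximately equal to 1. Applying this to the terms of the binomial expansion above we see that the right-hand side becomes
$$1 + np + \frac{n(n-1)}{2!}\, p^{2} + \dots + \frac{n(n-1)\cdots(n-r+1)}{r!}\, p^{r} + \dots + p^{n}$$
Applying condition (1) allows us to approximate terms such as $(n-1), (n-2), \dots$ to $n$ (mathematically, we are allowing $n \to \infty$) and the right-hand side of our expansion becomes
$$1 + np + \frac{n^{2}}{2!}\, p^{2} + \dots + \frac{n^{r}}{r!}\, p^{r} + \dots$$
Note that the term $p^{n} \to 0$ under these conditions and hence has been omitted.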
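We now have the series
$$1 + np + \frac{(np)^{2}}{2!} + \dots + \frac{(np)^{r}}{r!} + \dots$$
which, using condition (3), may be written as
$$1 + \lambda + \frac{\lambda^{2}}{2!} + \dots + \frac{\lambda^{r}}{r!} + \dots$$
You may recognise this as the expansion of $e^{\lambda}$.
If we are to be able to claim that the terms of this expansion represent probabilities, we must be sure that the sum of the terms is 1. We divide by $e^{\lambda}$ to satisfy this condition. This gives the result
$$\frac{e^{\lambda}}{e^{\lambda}} = 1 = e^{-\lambda}\left(1 + \lambda + \frac{\lambda^{2}}{2!} + \dots + \frac{\lambda^{r}}{r!} + \dots\right) = e^{-\lambda} + e^{-\lambda}\lambda + e^{-\lambda}\frac{\lambda^{2}}{2!} + \dots + e^{-\lambda}\frac{\lambda^{r}}{r!} + \dots$$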
The terms of this expansion are very good approximations to the corresponding binomial expansion under the conditions
1. $n$ is large
2. $p$ is small
3. $np = \lambda$ ($\lambda$ constant)
The Poisson approximation to the binomial distribution is summarized below.
Key Point 6
Poisson Approximation to the Binomial Distribution
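Assuming that $n$ is large, $p$ is small and that $np$ is constant, the terms
$$P(X = r) = {}^{n}C_{r}\,(1 - p)^{n-r} p^{r}$$
of a binomial distribution may be closely approximated by the terms
$$P(X = r) = e^{-\lambda}\,\frac{\lambda^{r}}{r!}, \qquad \lambda = np,$$
of the Poisson distribution for corresponding values of $r$.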
Example 12
We introduced the binomial distribution by considering the following scenario. A worn machine is known to produce 10% defective components. If the random variable $X$ is the number of defective components produced in a run of 3 components, find the probabilities that $X$ takes the values 0 to 3.
Suppose now that a similar machine which is known to produce 1% defective components is used for a production run of 40 components. We wish to calculate the probability that two defective items are produced. Essentially we are assuming that $X \sim B(40, 0.01)$ and are asking for $P(X = 2)$. We use both the binomial distribution and its Poisson approximation for comparison.
Solution
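Using the binomial distribution we have the solution
$$P(X = 2) = {}^{40}C_{2}\,(0.99)^{40-2}(0.01)^{2} = \frac{40 \times 39}{1 \times 2} \times 0.99^{38} \times 0.01^{2} = 0.0532$$
Note that the arithmetic involved is unwieldy. Using the Poisson approximation with $\lambda = np = 40 \times 0.01 = 0.4$ we have the solution
$$P(X = 2) = e^{-0.4}\,\frac{(0.4)^{2}}{2!} = 0.0536$$
Note that the arithmetic involved is simpler and the approximation is reasonable: the two values differ by about 0.0004.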
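The two calculations in this example can be checked numerically. This is a small sketch using only Python's standard `math` module; the variable names are illustrative and not from the original text.

```python
from math import comb, exp, factorial

n, p, r = 40, 0.01, 2   # run of 40 components, 1% defective, two defectives
lam = n * p             # Poisson mean, lambda = np = 0.4

# Exact binomial probability and its Poisson approximation.
binomial = comb(n, r) * (1 - p) ** (n - r) * p ** r
poisson = exp(-lam) * lam ** r / factorial(r)

print(f"binomial: {binomial:.4f}")
print(f"Poisson:  {poisson:.4f}")
print(f"absolute difference: {abs(binomial - poisson):.4f}")
```

The printed values are 0.0532 for the binomial calculation and 0.0536 for the Poisson approximation.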
1.1 Practical considerations
In practice, we can use the Poisson distribution to very closely approximate the binomial distribution provided that the product $np$ is constant with
$$n \geq 100 \quad \text{and} \quad p \leq 0.05$$
Note that this is not a hard-and-fast rule and we simply say that
‘the larger $n$ is the better and the smaller $p$ is the better provided that $np$ is a sensible size.’
The approximation remains good provided that $np < 5$ for values of $n$ as low as 20.
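One way to get a feel for how the quality of the approximation depends on $n$ and $p$ is to measure the largest gap between the binomial and Poisson probabilities. The sketch below is illustrative only: the helper `max_abs_error`, the cut-off `r_max` and the particular $(n, p)$ pairs are assumptions made here, not values taken from the text.

```python
from math import comb, exp, factorial

def max_abs_error(n, p, r_max=10):
    """Largest |binomial - Poisson| over r = 0, ..., min(n, r_max), with lambda = n * p."""
    lam = n * p
    worst = 0.0
    for r in range(min(n, r_max) + 1):
        b = comb(n, r) * (1 - p) ** (n - r) * p ** r
        q = exp(-lam) * lam ** r / factorial(r)
        worst = max(worst, abs(b - q))
    return worst

# Hypothetical (n, p) pairs chosen for illustration only.
for n, p in [(20, 0.05), (100, 0.01), (1000, 0.001), (20, 0.5)]:
    print(f"n={n:5d}  p={p:<6}  np={n * p:<5.2f}  max error={max_abs_error(n, p):.4f}")
```

With $np = 1$ held fixed the error shrinks as $n$ grows, while the pair $n = 20$, $p = 0.5$ shows how poor the approximation becomes when $p$ is not small.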
Task!
Mass-produced needles are packed in boxes of 1000. It is believed that 1 needle in 2000 on average is substandard. What is the probability that a box contains 2 or more defectives? The correct model is the binomial distribution with $n = 1000$, $p = \frac{1}{2000}$ (and $q = \frac{1999}{2000}$).
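- Using the binomial distribution calculate $P(X = 0)$, $P(X = 1)$ and hence $P(X \geq 2)$:
  $$P(X = 0) = \left(\frac{1999}{2000}\right)^{1000} = 0.60645, \qquad P(X = 1) = 1000 \times \left(\frac{1999}{2000}\right)^{999} \times \frac{1}{2000} = 0.30338 \quad \text{(5 d.p.)}$$
  Hence $P$(2 or more defectives) $\approx 1 - 0.60645 - 0.30338 = 0.09017$.
- Now choose a suitable value for $\lambda$ in order to use a Poisson model to approximate the probabilities:
  $$\lambda = np = 1000 \times \frac{1}{2000} = \frac{1}{2}$$
  Now recalculate the probability that there are 2 or more defectives using the Poisson distribution with $\lambda = \frac{1}{2}$:
  $$P(X = 0) = e^{-1/2} = 0.6065, \qquad P(X = 1) = \tfrac{1}{2}\, e^{-1/2} = 0.3033 \quad \text{(4 d.p.)}$$
  Hence $P$(2 or more defectives) $\approx 1 - 0.6065 - 0.3033 = 0.0902$.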
In the above Task we have obtained the same answer to 4 d.p. as the exact binomial calculation, essentially because $p$ was so small. We shall not always be so lucky!
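The comparison in this Task can be reproduced in a few lines. This is a sketch under the Task's stated model ($n = 1000$, $p = 1/2000$); the variable names are chosen here for illustration.

```python
from math import exp

n = 1000        # needles per box
p = 1 / 2000    # probability that a needle is substandard
q = 1 - p

# Exact binomial: P(X >= 2) = 1 - P(X = 0) - P(X = 1)
binom_p0 = q ** n
binom_p1 = n * q ** (n - 1) * p
binom_tail = 1 - binom_p0 - binom_p1

# Poisson approximation with lambda = np = 0.5
lam = n * p
pois_p0 = exp(-lam)
pois_p1 = lam * exp(-lam)
pois_tail = 1 - pois_p0 - pois_p1

print(f"binomial: {binom_tail:.4f}")
print(f"Poisson:  {pois_tail:.4f}")
```

Both lines print 0.0902, so the two models agree to 4 d.p. here.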
Example 13
In the manufacture of glassware, bubbles can occur in the glass which reduces the status of the glassware to that of a ‘second’. If, on average, one in every 1000 items produced has a bubble, calculate the probability that exactly six items in a batch of three thousand are seconds.
Solution
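Suppose that $X$ = number of items with bubbles, then $X \sim B(3000, 0.001)$.
Since $n = 3000$ is large and $p = 0.001$ is small we can use the Poisson distribution with $\lambda = np = 3000 \times 0.001 = 3$. The calculation is:
$$P(X = 6) = e^{-3}\,\frac{3^{6}}{6!} \approx 0.0498 \times 1.0125 \approx 0.0504$$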
The result means that we have about a 5% chance of finding exactly six seconds in a batch of three thousand items of glassware.
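As a check on this example, the Poisson value can be set beside the exact binomial probability. This is a minimal sketch; Python's integer arithmetic handles the large binomial coefficient exactly, and the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, r = 3000, 0.001, 6   # batch size, probability of a bubble, number of seconds
lam = n * p                # Poisson mean, lambda = np = 3

poisson = exp(-lam) * lam ** r / factorial(r)
binomial = comb(n, r) * (1 - p) ** (n - r) * p ** r

print(f"Poisson:  {poisson:.4f}")
print(f"binomial: {binomial:.4f}")
```

The Poisson line prints 0.0504, and the exact binomial probability is very close to it.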
Example 14
A manufacturer produces light-bulbs that are packed into boxes of 100. If quality control studies indicate that 0.5% of the light-bulbs produced are defective, what percentage of the boxes will contain:
- no defectives?
- 2 or more defectives?
Solution
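As $n = 100$ is large and $p$, the probability that a bulb is defective, is small, use the Poisson approximation to the binomial probability distribution. If $X$ = number of defective bulbs in a box, then $X$ is approximately Poisson distributed with mean $\lambda$,
where $\lambda = np = 100 \times 0.005 = 0.5$.
- $P(X = 0) = e^{-0.5}\,\dfrac{(0.5)^{0}}{0!} = e^{-0.5} = 0.6065$, so about 61% of the boxes contain no defectives.
- $P(X \geq 2) = P(X = 2) + P(X = 3) + \dots$
but it is easier to consider:
$$P(X \geq 2) = 1 - [P(X = 0) + P(X = 1)], \qquad P(X = 1) = e^{-0.5}\,\frac{(0.5)^{1}}{1!} = 0.3033$$
i.e. $P(X \geq 2) = 1 - [0.6065 + 0.3033] = 0.0902$, so about 9% of the boxes contain 2 or more defectives.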
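The complement shortcut used for "2 or more defectives" can be checked against the direct sum it replaces. The sketch below assumes the Poisson model with $\lambda = np = 0.5$ from this example; the helper name `poisson_pmf` and the truncation of the sum at $r = 100$ (the box size) are choices made here for illustration.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** r / factorial(r)

lam = 100 * 0.005   # lambda = np = 0.5

p_none = poisson_pmf(0, lam)

# Complement rule: P(X >= 2) = 1 - [P(X = 0) + P(X = 1)]
p_two_or_more = 1 - (poisson_pmf(0, lam) + poisson_pmf(1, lam))

# Direct sum P(X = 2) + P(X = 3) + ..., truncated at r = 100
direct_sum = sum(poisson_pmf(r, lam) for r in range(2, 101))

print(f"no defectives:        {100 * p_none:.0f}% of boxes")
print(f"2 or more defectives: {100 * p_two_or_more:.0f}% of boxes")
print(f"direct sum agrees:    {abs(direct_sum - p_two_or_more) < 1e-12}")
```

The two routes give the same probability; the complement needs only two terms, which is why it is the easier calculation by hand.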