1 The binomial model

We have introduced random variables from a general perspective and have seen that there are two basic types: discrete and continuous. We examine four particular examples of distributions for random variables which occur often in practice and have been given special names. They are the binomial distribution, the Poisson distribution, the Hypergeometric distribution and the Normal distribution. The first three are distributions for discrete random variables and the fourth is for a continuous random variable. In this Section we focus attention on the binomial distribution.

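The binomial distribution can be used in situations in which a given experiment (often referred to, in this context, as a trial) is repeated a number of times. For the binomial model to be applied the following four criteria must be satisfied:

  1. the experiment consists of a fixed number, $n$, of trials;
  2. each trial has exactly two possible outcomes, usually called 'success' and 'failure';
  3. the probability of success, $p$, is the same on every trial;
  4. the trials are independent of one another.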

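For example, if we consider throwing a coin 7 times, what is the probability that exactly 4 Heads occur? This problem can be modelled by the binomial distribution since the four basic criteria are assumed satisfied, as we see:

  1. there is a fixed number of trials: 7 throws of the coin;
  2. each throw has two outcomes: Heads or Tails;
  3. the probability of Heads is the same on every throw;
  4. each throw is independent of the others.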

The reader will be able to complete the solution to this example once we have constructed the general binomial model.

The following two scenarios are typical of those met by engineers. The reader should check that the criteria stated above are met by each scenario.

  1. An electronic product has a total of 30 integrated circuits built into it. The product is capable of operating successfully only if at least 27 of the circuits operate properly. What is the probability that the product operates successfully if the probability of any integrated circuit failing to operate is 0.01?
  2. Digital communication is achieved by transmitting information in “bits”. Errors do occur in data transmissions. Suppose that the number of bits in error is represented by the random variable X and that the probability of a communication error in a bit is 0.001. If at most 2 errors are present in a 1000 bit transmission, the transmission can be successfully decoded. If a 1000 bit message is transmitted, find the probability that it can be successfully decoded.

Before developing the general binomial distribution we consider the following examples which, as you will soon recognise, have the basic characteristics of a binomial distribution.

Example 7

In a box of floppy discs it is known that 95% will work. A sample of three of the discs is selected at random.
Find the probability that

  1. none
  2. 1,  
  3. 2,
  4. all 3 of the sample will work.
Solution

Let the event {the disc works} be $W$ and the event {the disc fails} be $F$. The probability that a disc will work is denoted by $P(W)$ and the probability that a disc will fail is denoted by $P(F)$. Then $P(W) = 0.95$ and $P(F) = 1 - P(W) = 1 - 0.95 = 0.05$.

  1. The probability that none of the discs works equals the probability that all 3 discs fail. As the events are independent, this is given by:

    $P(\text{none work}) = P(FFF) = P(F) \times P(F) \times P(F) = 0.05 \times 0.05 \times 0.05 = 0.05^3 = 0.000125$
  2. If only one disc works then you could select the three discs in the following orders

    $(FFW)$ or $(FWF)$ or $(WFF)$, hence

    $P(\text{one works}) = P(FFW) + P(FWF) + P(WFF)$

    $= P(F) \times P(F) \times P(W) + P(F) \times P(W) \times P(F) + P(W) \times P(F) \times P(F)$

    $= (0.05 \times 0.05 \times 0.95) + (0.05 \times 0.95 \times 0.05) + (0.95 \times 0.05 \times 0.05)$

    $= 3 \times (0.05)^2 \times 0.95 = 0.007125$
  3. If 2 discs work you could select them in order

    $(FWW)$ or $(WFW)$ or $(WWF)$, hence

    $P(\text{two work}) = P(FWW) + P(WFW) + P(WWF)$

    $= P(F) \times P(W) \times P(W) + P(W) \times P(F) \times P(W) + P(W) \times P(W) \times P(F)$

    $= (0.05 \times 0.95 \times 0.95) + (0.95 \times 0.05 \times 0.95) + (0.95 \times 0.95 \times 0.05)$

    $= 3 \times (0.05) \times (0.95)^2 = 0.135375$
  4. The probability that all 3 discs work is given by $P(WWW) = 0.95^3 = 0.857375$.

    Notice that since the 4 outcomes we have dealt with are all possible outcomes of selecting 3 discs, the probabilities should add up to 1. It is an easy check to verify that they do.

    One of the most important assumptions above is that of independence. The probability of selecting a working disc remains unchanged whether or not the previously selected disc worked.
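The four probabilities in Example 7 can be checked by listing every ordered outcome and multiplying probabilities along each one. The following is a minimal Python sketch, not part of the original text; the function name `working_disc_distribution` is purely illustrative, and independence between discs is assumed exactly as in the solution above.

```python
from itertools import product

P_WORK = 0.95          # P(W), given in Example 7
P_FAIL = 1 - P_WORK    # P(F) = 1 - P(W)


def working_disc_distribution(n_discs=3):
    """Return {number of working discs: probability} by enumerating outcomes.

    Each ordered outcome such as ('F', 'W', 'F') has probability equal to the
    product of the individual probabilities (independence is assumed).
    """
    dist = {k: 0.0 for k in range(n_discs + 1)}
    for outcome in product("WF", repeat=n_discs):
        prob = 1.0
        for disc in outcome:
            prob *= P_WORK if disc == "W" else P_FAIL
        dist[outcome.count("W")] += prob
    return dist


dist = working_disc_distribution()
for k, prob in dist.items():
    print(f"P({k} work) = {prob:.6f}")
print(f"total = {sum(dist.values()):.6f}")
```

The printed total is the "easy check" mentioned above: the four outcomes exhaust all possibilities, so their probabilities should add up to 1 apart from floating-point rounding.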

Example 8

A worn machine is known to produce 10% defective components. If the random variable X is the number of defective components produced in a run of 3 components, find the probabilities that X takes the values 0 to 3.

Solution

Assuming that the production of components is independent and that the probability p = 0.1 of producing a defective component remains constant, the following table summarizes the production run. We let G represent a good component and let D represent a defective component.
Note that since we are only dealing with two possible outcomes, we can say that the probability $q$ of the machine producing a good component is $1 - 0.1 = 0.9$. More generally, we know that $q + p = 1$ if we are dealing with a binomial distribution.

| Outcome | Value of $X$ | Probability of occurrence |
|---|---|---|
| GGG | 0 | $(0.9)(0.9)(0.9) = (0.9)^3$ |
| GGD | 1 | $(0.9)(0.9)(0.1) = (0.9)^2(0.1)$ |
| GDG | 1 | $(0.9)(0.1)(0.9) = (0.9)^2(0.1)$ |
| DGG | 1 | $(0.1)(0.9)(0.9) = (0.9)^2(0.1)$ |
| DDG | 2 | $(0.1)(0.1)(0.9) = (0.9)(0.1)^2$ |
| DGD | 2 | $(0.1)(0.9)(0.1) = (0.9)(0.1)^2$ |
| GDD | 2 | $(0.9)(0.1)(0.1) = (0.9)(0.1)^2$ |
| DDD | 3 | $(0.1)(0.1)(0.1) = (0.1)^3$ |

From this table it is easy to see that

$P(X = 0) = (0.9)^3$

$P(X = 1) = 3 \times (0.9)^2 (0.1)$

$P(X = 2) = 3 \times (0.9)(0.1)^2$

$P(X = 3) = (0.1)^3$

Clearly, a pattern is developing. In fact you may have already realized that the probabilities we have found are just the terms of the expansion of the expression $(0.9 + 0.1)^3$ since

$(0.9 + 0.1)^3 = (0.9)^3 + 3 \times (0.9)^2 (0.1) + 3 \times (0.9)(0.1)^2 + (0.1)^3$

We now develop the binomial distribution from a more general perspective. If you find the theory getting a bit heavy simply refer back to this example to help clarify the situation.

First we shall find it convenient to denote the probability of failure on a trial, which is $1 - p$, by $q$, that is:

$q = 1 - p$.

What we shall do is to calculate probabilities of the number of ‘successes’ occurring in n trials, beginning with n = 1 .

$n = 1$: With only one trial we can observe either 1 success (with probability $p$) or 0 successes (with probability $q$).

$n = 2$: Here there are 3 possibilities: we can observe 2, 1 or 0 successes. Let $S$ denote a success and $F$ denote a failure. So a failure followed by a success would be denoted by $FS$ whilst two failures followed by one success would be denoted by $FFS$ and so on.
Then

$P(\text{2 successes in 2 trials}) = P(SS) = P(S)P(S) = p^2$

(where we have used the assumption of independence between trials and hence multiplied probabilities). Now, using the usual rules of basic probability, we have:

$P(\text{1 success in 2 trials}) = P[(SF) \cup (FS)] = P(SF) + P(FS) = pq + qp = 2pq$

$P(\text{0 successes in 2 trials}) = P(FF) = P(F)P(F) = q^2$

The three probabilities we have found, $q^2$, $2qp$, $p^2$, are in fact the terms which arise in the binomial expansion of $(q + p)^2 = q^2 + 2qp + p^2$. We also note that since $q = 1 - p$ the probabilities sum to 1 (as we should expect):

$q^2 + 2qp + p^2 = (q + p)^2 = ((1 - p) + p)^2 = 1$

Task!

List the outcomes for the binomial model for the case n = 3 , calculate their probabilities and display the results in a table.

The possible outcomes are: {three successes, two successes, one success, no successes}

Three successes occur only as $SSS$ with probability $p^3$.

Two successes can occur as $SSF$ with probability $p^2 q$, as $SFS$ with probability $pqp$ or as $FSS$ with probability $qp^2$.

These are mutually exclusive events so the combined probability is the sum $3p^2 q$.

Similarly, we can calculate the other probabilities and obtain the following table of results.

| Number of successes | 3 | 2 | 1 | 0 |
|---|---|---|---|---|
| Probability | $p^3$ | $3p^2q$ | $3pq^2$ | $q^3$ |

Note that the probabilities you have obtained:

$q^3$, $3q^2p$, $3qp^2$, $p^3$

are the terms which arise in the binomial expansion of $(q + p)^3 = q^3 + 3q^2p + 3qp^2 + p^3$

Task!

Repeat the previous Task for the binomial model for the case with n = 4 .

| Number of successes | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|
| Probability | $p^4$ | $4p^3q$ | $6p^2q^2$ | $4pq^3$ | $q^4$ |

Again we explore the connection between the probabilities and the terms in the binomial expansion of $(q + p)^4$. Consider this expansion

$(q + p)^4 = q^4 + 4q^3p + 6q^2p^2 + 4qp^3 + p^4$

Then, for example, the term $4p^3q$ is the probability of 3 successes in the four trials. These successes can occur anywhere in the four trials and there must be one failure, hence the $p^3$ and $q$ components which are multiplied together. The remaining part of this term, 4, is the number of ways of selecting three objects from 4.

Similarly there are ${}^{4}C_{2} = \dfrac{4!}{2!\,2!} = 6$ ways of selecting two objects from 4 so that the coefficient 6 combines with $p^2$ and $q^2$ to give the probability of two successes (and hence two failures) in four trials.

The approach described here can be extended for any number n of trials.
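The link between the coefficients and "the number of ways of selecting" can be verified by brute force for small $n$. The sketch below is an illustration rather than part of the original text: it counts the sequences of S and F containing exactly $r$ successes and compares the count with the binomial coefficient returned by Python's `math.comb`. The helper name `count_arrangements` is hypothetical.

```python
from itertools import product
from math import comb


def count_arrangements(n, r):
    """Count the S/F sequences of length n containing exactly r successes."""
    return sum(1 for seq in product("SF", repeat=n) if seq.count("S") == r)


for n in range(1, 6):
    counts = [count_arrangements(n, r) for r in range(n + 1)]
    # Each row should match the binomial coefficients nC0, nC1, ..., nCn.
    assert counts == [comb(n, r) for r in range(n + 1)]
    print(n, counts)
```

For $n = 4$ the row printed is the set of coefficients 1, 4, 6, 4, 1 that appear in the expansion of $(q + p)^4$.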

Key Point 4

The Binomial Probabilities

Let $X$ be a discrete random variable, being the number of successes occurring in $n$ independent trials of an experiment. If $X$ is to be described by the binomial model, the probability of exactly $r$ successes in $n$ trials is given by

$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$.

Here there are $r$ successes (each with probability $p$), $n - r$ failures (each with probability $q$) and ${}^{n}C_{r} = \dfrac{n!}{r!\,(n-r)!}$ is the number of ways of placing the $r$ successes among the $n$ trials.
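The formula in Key Point 4 translates almost directly into code. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is an illustrative choice. As a usage example it completes the coin problem posed at the start of this Section, under the added assumption that the coin is fair ($p = 0.5$), which the original text does not state explicitly.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials.

    Implements nCr * p^r * q^(n-r) with q = 1 - p.
    """
    if not 0 <= r <= n:
        return 0.0          # impossible numbers of successes
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)


# Coin example: exactly 4 Heads in 7 throws, assuming a fair coin (p = 0.5).
print(binomial_pmf(4, 7, 0.5))  # 35/128 = 0.2734375
```

With $p = q = 0.5$ every sequence of 7 throws has probability $1/128$, and there are ${}^{7}C_{4} = 35$ sequences containing exactly 4 Heads.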

1.1 Notation

If a random variable $X$ follows a binomial distribution in which an experiment is repeated $n$ times, each with probability $p$ of success, then we write $X \sim B(n, p)$.

Example 9

A worn machine is known to produce 10% defective components. If the random variable X is the number of defective components produced in a run of 4 components, find the probabilities that X takes the values 0 to 4.

Solution

From Example 8, we know that the probabilities required are the terms of the expansion of the expression:

$(0.9 + 0.1)^4$ so $X \sim B(4, 0.1)$

Hence the required probabilities are (using the general formula with n = 4 and p = 0.1 )

$P(X = 0) = (0.9)^4 = 0.6561$

$P(X = 1) = 4(0.9)^3(0.1) = 0.2916$

$P(X = 2) = \dfrac{4 \times 3}{1 \times 2}(0.9)^2(0.1)^2 = 0.0486$

$P(X = 3) = \dfrac{4 \times 3 \times 2}{1 \times 2 \times 3}(0.9)(0.1)^3 = 0.0036$

$P(X = 4) = (0.1)^4 = 0.0001$

Also, since we are using the expansion of $(0.9 + 0.1)^4$, the probabilities should sum to 1. This is a useful check on your arithmetic when you are using a binomial distribution.
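The sum-to-one check is easy to automate. The short Python sketch below, given as an illustration only, recomputes the five probabilities of Example 9 from the general formula and confirms that they add up to 1 to within floating-point rounding.

```python
from math import comb, isclose

n, p = 4, 0.1      # Example 9: run of 4 components, 10% defective
q = 1 - p

# P(X = r) for r = 0, 1, ..., n using nCr * p^r * q^(n-r)
probs = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

for r, prob in enumerate(probs):
    print(f"P(X = {r}) = {prob:.4f}")

# The terms of (q + p)^n must sum to 1, apart from rounding error.
assert isclose(sum(probs), 1.0)
```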

Example 10

In a box of switches it is known 10% of the switches are faulty. A technician is wiring 30 circuits, each of which needs one switch. What is the probability that

  1. all 30 work,
  2. at most 2 of the circuits do not work?
Solution

The answers involve binomial distributions because there are only two states for each circuit - it either works or it doesn’t work.

A trial is the operation of testing each circuit.

A success is that it works. We are given $P(\text{success}) = p = 0.9$

Also we have the number of trials $n = 30$

Applying the binomial distribution $P(X = r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n-r}$.

  1. Probability that all 30 work is $P(X = 30) = {}^{30}C_{30}(0.9)^{30}(0.1)^{0} = 0.04239$
  2. The statement that “at most 2 circuits do not work” implies that 28, 29 or 30 work. That is, $X \geq 28$, and

    $P(X \geq 28) = P(X = 28) + P(X = 29) + P(X = 30)$

    $P(X = 30) = {}^{30}C_{30}(0.9)^{30}(0.1)^{0} = 0.04239$

    $P(X = 29) = {}^{30}C_{29}(0.9)^{29}(0.1)^{1} = 0.14130$

    $P(X = 28) = {}^{30}C_{28}(0.9)^{28}(0.1)^{2} = 0.22766$

    Hence $P(X \geq 28) = 0.41135$
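A probability such as "28 or more successes" is just a sum of individual binomial terms, so it can be checked with a short loop. The sketch below is illustrative only (the helper `binomial_pmf` is not from the original text). It also computes the same answer a second way, by counting failures instead: if $Y$ is the number of circuits that do not work, with $n = 30$ and failure probability 0.1, then "at most 2 do not work" is $Y \leq 2$.

```python
from math import comb, isclose


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n = 30

# Counting working circuits: success probability 0.9, need 28, 29 or 30.
p_all_work = binomial_pmf(30, n, 0.9)
p_at_least_28 = sum(binomial_pmf(r, n, 0.9) for r in range(28, n + 1))

# Counting faulty circuits instead: failure probability 0.1, need 0, 1 or 2.
p_at_most_2_faulty = sum(binomial_pmf(r, n, 0.1) for r in range(0, 3))

print(f"P(all 30 work)     = {p_all_work:.5f}")
print(f"P(at least 28 work) = {p_at_least_28:.5f}")
assert isclose(p_at_least_28, p_at_most_2_faulty)
```

Both routes describe the same event, so they must agree; choosing whichever has fewer terms to add is usually the quicker calculation by hand.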

Example 11

A University Engineering Department has introduced a new software package called SOLVIT. To save money, the University’s Purchasing Department has negotiated a bargain price for a 4-user licence that allows only four students to use SOLVIT at any one time. It is estimated that this should allow 90% of students to use the package when they need it. The Students’ Union has asked for more licences to be bought since engineering students report having to queue excessively to use SOLVIT. As a result the Computer Centre monitors the use of the software. Their findings show that on average 20 students are logged on at peak times and 4 of these want to use SOLVIT. Was the Purchasing Department’s estimate correct?

Solution

$P(\text{student wanted to use SOLVIT}) = \dfrac{4}{20} = 0.2$

Let X be the number of students wanting to use SOLVIT at any one time, then

$P(X = 0) = {}^{20}C_{0}(0.2)^{0}(0.8)^{20} = 0.0115$

$P(X = 1) = {}^{20}C_{1}(0.2)^{1}(0.8)^{19} = 0.0576$

$P(X = 2) = {}^{20}C_{2}(0.2)^{2}(0.8)^{18} = 0.1369$

$P(X = 3) = {}^{20}C_{3}(0.2)^{3}(0.8)^{17} = 0.2054$

$P(X = 4) = {}^{20}C_{4}(0.2)^{4}(0.8)^{16} = 0.2182$

Therefore

$P(X \leq 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = 0.0115 + 0.0576 + 0.1369 + 0.2054 + 0.2182 = 0.6296$

The probability that more than 4 students will want to use SOLVIT is

$P(X > 4) = 1 - P(X \leq 4) = 0.3704$

That is, 37% of the time there will be more than 4 students wanting to use the software. The Purchasing Department has grossly overestimated the availability of the software on the basis of a 4-user licence.
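The cumulative probability in Example 11 involves five terms, which makes it a natural candidate for a quick numerical check. The sketch below is an illustration under the same modelling assumptions as the solution (20 students logged on, each independently wanting SOLVIT with probability 0.2); the helper names are hypothetical. Summing the terms directly gives roughly 0.630 for $P(X \leq 4)$ and 0.370 for $P(X > 4)$.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


def binomial_cdf(k, n, p):
    """Probability of at most k successes: sum of P(X = r) for r = 0..k."""
    return sum(binomial_pmf(r, n, p) for r in range(k + 1))


n, p = 20, 4 / 20            # 20 students logged on, P(wants SOLVIT) = 0.2
licences = 4

p_enough = binomial_cdf(licences, n, p)   # P(X <= 4): everyone gets access
p_queue = 1 - p_enough                    # P(X > 4): someone has to wait

print(f"P(X <= 4) = {p_enough:.4f}")
print(f"P(X > 4)  = {p_queue:.4f}")
```

Changing `licences` shows how many concurrent users would be needed before $P(X \leq \text{licences})$ reaches the 90% figure quoted by the Purchasing Department.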

Task!

Using the binomial model, and assuming that a success occurs with probability $\frac{1}{5}$ in each trial, find the probability that in 6 trials there are

  1. 0 successes
  2. 3 successes
  3. 2 failures.

Let X be the number of successes in 6 independent trials.

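In each case $p = \frac{1}{5}$ and $q = 1 - p = \frac{4}{5}$.

  1. Here $r = 0$ and

    $P(X = 0) = q^6 = \left(\dfrac{4}{5}\right)^6 = \dfrac{4096}{15625} \approx 0.262$
  2. Here $r = 3$ and

    $P(X = 3) = {}^{6}C_{3}\, p^3 q^3 = \dfrac{6 \times 5 \times 4}{1 \times 2 \times 3} \times \left(\dfrac{1}{5}\right)^3 \times \left(\dfrac{4}{5}\right)^3 = \dfrac{20 \times 64}{5^6} = \dfrac{1280}{15625} \approx 0.0819$
  3. Two failures in 6 trials means 4 successes, so here $r = 4$ and

    $P(X = 4) = {}^{6}C_{4}\, p^4 q^2 = \dfrac{6 \times 5}{1 \times 2} \times \left(\dfrac{1}{5}\right)^4 \times \left(\dfrac{4}{5}\right)^2 = \dfrac{15 \times 4^2}{5^6} = \dfrac{240}{15625} = 0.01536$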
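Because $p = \frac{1}{5}$ is a simple fraction, the answers to this Task can be reproduced exactly rather than as rounded decimals. The sketch below is illustrative and uses Python's `fractions.Fraction` for exact rational arithmetic; the function name `binomial_pmf_exact` is hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf_exact(r, n, p):
    """Exact probability of r successes in n trials, with p given as a Fraction."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)


p = Fraction(1, 5)
n = 6

# r = 0 and r = 3 successes; "2 failures" corresponds to r = 4 successes.
answers = {r: binomial_pmf_exact(r, n, p) for r in (0, 3, 4)}

for r, prob in answers.items():
    print(f"P(X = {r}) = {prob} = {float(prob):.5f}")
```

Note that `Fraction` prints results in lowest terms, so a value such as $\frac{240}{15625}$ will appear in reduced form even though it is the same number.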