2 Expectation and variance of the binomial distribution

For a binomial distribution $X \sim B(n, p)$, the mean and variance, as we shall see, have a simple form. While we will not prove the formulae in general terms (the algebra can be rather tedious), we will illustrate the results for cases involving small values of $n$.

The case $n = 2$

Essentially, we have a random variable $X$ which follows a binomial distribution $X \sim B(2, p)$. Writing $q = 1 - p$, the values taken by $X$ (and by $X^2$, needed to calculate the variance) and their probabilities are shown in the following table:

| $x$ | $x^2$ | $P(X = x)$ | $x\,P(X = x)$ | $x^2\,P(X = x)$ |
|---|---|---|---|---|
| 0 | 0 | $q^2$ | 0 | 0 |
| 1 | 1 | $2qp$ | $2qp$ | $2qp$ |
| 2 | 4 | $p^2$ | $2p^2$ | $4p^2$ |

We can now calculate the mean of this distribution:

$$E(X) = \sum x\,P(X = x) = 0 + 2qp + 2p^2 = 2p(q + p) = 2p \quad \text{since } q + p = 1$$

Similarly, the variance V ( X ) is given by

$$V(X) = E(X^2) - [E(X)]^2 = 0 + 2qp + 4p^2 - (2p)^2 = 2qp$$

Task!

Calculate the mean and variance of a random variable $X$ which follows a binomial distribution $X \sim B(3, p)$.

The table of values appropriate to this case is:

| $x$ | $x^2$ | $P(X = x)$ | $x\,P(X = x)$ | $x^2\,P(X = x)$ |
|---|---|---|---|---|
| 0 | 0 | $q^3$ | 0 | 0 |
| 1 | 1 | $3q^2p$ | $3q^2p$ | $3q^2p$ |
| 2 | 4 | $3qp^2$ | $6qp^2$ | $12qp^2$ |
| 3 | 9 | $p^3$ | $3p^3$ | $9p^3$ |

Hence $E(X) = \sum x\,P(X = x) = 0 + 3q^2p + 6qp^2 + 3p^3 = 3p(q + p)^2 = 3p$, since $q + p = 1$.

$$\begin{aligned}
V(X) &= E(X^2) - [E(X)]^2 \\
&= 0 + 3q^2p + 12qp^2 + 9p^3 - (3p)^2 \\
&= 3p\left(q^2 + 4qp + 3p^2 - 3p\right) \\
&= 3p\left((1 - p)^2 + 4(1 - p)p + 3p^2 - 3p\right) \\
&= 3p\left(1 - 2p + p^2 + 4p - 4p^2 + 3p^2 - 3p\right) \\
&= 3p(1 - p) = 3pq
\end{aligned}$$
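The table method used for $n = 2$ and $n = 3$ can be checked mechanically. The following is a minimal Python sketch and is not part of the original text: the helper name `moments_from_table` and the trial value $p = \frac{1}{4}$ are illustrative assumptions. Exact fractions are used so that the comparison with $np$ and $npq$ involves no rounding.

```python
from fractions import Fraction


def moments_from_table(probabilities):
    """Return (mean, variance) for X taking the values 0, 1, 2, ...
    with the given probabilities P(X = 0), P(X = 1), ..."""
    mean = sum(x * prob for x, prob in enumerate(probabilities))
    mean_square = sum(x * x * prob for x, prob in enumerate(probabilities))
    return mean, mean_square - mean ** 2


# Illustrative value of p (an assumption; any 0 <= p <= 1 could be used)
p = Fraction(1, 4)
q = 1 - p

# The P(X = x) columns of the n = 2 and n = 3 tables
table_n2 = [q**2, 2 * q * p, p**2]
table_n3 = [q**3, 3 * q**2 * p, 3 * q * p**2, p**3]

mean2, var2 = moments_from_table(table_n2)
mean3, var3 = moments_from_table(table_n3)

# Compare with 2p, 2pq and with 3p, 3pq
print(mean2 == 2 * p, var2 == 2 * p * q)
print(mean3 == 3 * p, var3 == 3 * p * q)
```

A check at one value of $p$ is of course only an illustration, not a proof; the algebra above is what establishes the result for every $p$.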

From the results given above, it is reasonable to assert the following result in Key Point 5.

Key Point 5

Expectation and Variance of the Binomial Distribution

If a random variable $X$ which can assume the values $0, 1, 2, 3, \dots, n$ follows a binomial distribution $X \sim B(n, p)$ so that

$$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r} = {}^{n}C_{r}\, p^{r} (1 - p)^{n-r}$$

then the expectation and variance of the distribution are given by the formulae

$$E(X) = np \quad \text{and} \quad V(X) = np(1 - p) = npq$$
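Although the general result is not proved here, it can be spot-checked numerically for particular values of $n$ and $p$. The sketch below is one way to do this in Python; the function names `binomial_pmf` and `mean_and_variance`, the value $p = \frac{3}{10}$ and the chosen values of $n$ are all assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ B(n, p)."""
    q = 1 - p
    return [comb(n, r) * p**r * q ** (n - r) for r in range(n + 1)]


def mean_and_variance(pmf):
    """Mean and variance computed directly from the definition."""
    mean = sum(r * prob for r, prob in enumerate(pmf))
    variance = sum(r * r * prob for r, prob in enumerate(pmf)) - mean**2
    return mean, variance


# Illustrative (assumed) parameter choices
p = Fraction(3, 10)
results = {}
for n in (1, 4, 10, 25):
    results[n] = mean_and_variance(binomial_pmf(n, p))
    # Left: values from the definition; right: np and np(1 - p)
    print(n, results[n], (n * p, n * p * (1 - p)))
```

Because the arithmetic is exact, the two pairs printed on each line agree exactly rather than approximately.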

Task!

A die is thrown repeatedly 36 times in all. Find $E(X)$ and $V(X)$ where $X$ is the number of sixes obtained.

Consider the occurrence of a six, with $X$ being the number of sixes thrown in 36 trials.

The random variable $X$ follows a binomial distribution. (Why? Refer to page 18 for the criteria if necessary.) A trial is the operation of throwing a die. A success is the occurrence of a 6 on a particular trial, so $p = \frac{1}{6}$. We have $n = 36$, $p = \frac{1}{6}$ so that

$$E(X) = np = 36 \times \frac{1}{6} = 6 \quad \text{and} \quad V(X) = npq = 36 \times \frac{1}{6} \times \frac{5}{6} = 5.$$

Hence the standard deviation is $\sigma = \sqrt{5} \approx 2.236$.

$E(X) = 6$ implies that in 36 throws of a fair die we would expect, on average, to see 6 sixes. This makes perfect sense, of course.
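The arithmetic in this Task can be reproduced in a few lines. This is a minimal sketch, assuming the Key Point 5 formulae; the function name `binomial_mean_variance` is a hypothetical helper introduced here for illustration.

```python
from fractions import Fraction
from math import sqrt


def binomial_mean_variance(n, p):
    """Return (np, npq) for X ~ B(n, p), using the Key Point 5 formulae."""
    q = 1 - p
    return n * p, n * p * q


# 36 throws of a fair die; success = throwing a six
n = 36
p = Fraction(1, 6)

expected_sixes, variance = binomial_mean_variance(n, p)
std_dev = sqrt(float(variance))

print(expected_sixes, variance, round(std_dev, 3))
```

Using `Fraction` for $p$ keeps the mean and variance exact; only the standard deviation is converted to a floating-point value.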

Exercises
  1. The probability that a mountain-bike rider travelling along a certain track will have a tyre burst is 0.05. Find the probability that among 17 riders:
    1. exactly one has a burst tyre
    2. at most three have a burst tyre
    3. two or more have burst tyres.
    1. A transmission channel transmits zeros and ones in strings of length 8, called ‘words’. Possible distortion may change a one to a zero or vice versa; assume this distortion occurs with probability 0.01 for each digit, independently. An error-correcting code is employed in the construction of the word such that the receiver can deduce the word correctly if at most one digit is in error. What is the probability the word is decoded incorrectly?
    2. Assume that a word is a sequence of 10 zeros or ones and, as before, the probability of incorrect transmission of a digit is 0.01. If the error-correcting code allows correct decoding of the word if no more than two digits are incorrect, compute the probability that the word is decoded correctly.
  2. An examination consists of 10 multi-choice questions, in each of which a candidate has to deduce which one of five suggested answers is correct. A completely unprepared student guesses each answer completely randomly. What is the probability that this student gets 8 or more questions correct? Draw the appropriate moral!
  3. The probability that a machine will produce all bolts in a production run within specification is 0.998. A sample of 8 machines is taken at random. Calculate the probability that
    1. all 8 machines, 
    2. 7 or 8 machines, 
    3. at least 6 machines

      will produce all bolts within specification.

  4. The probability that a machine develops a fault within the first 3 years of use is 0.003. If 40 machines are selected at random, calculate the probability that 38 or more will not develop any faults within the first 3 years of use.
  5. A computer installation has 10 terminals. Independently, the probability that any one terminal will require attention during a week is 0.1. Find the probabilities that
    1. 0,
    2. 1,
    3. 2,
    4. 3 or more, terminals will require attention during the next week.
  6. The quality of electronic chips is checked by examining samples of 5. The frequency distribution of the number of defective chips per sample obtained when 100 samples have been examined is:

    | No. of defectives | 0 | 1 | 2 | 3 | 4 | 5 |
    |---|---|---|---|---|---|---|
    | No. of samples | 47 | 34 | 16 | 3 | 0 | 0 |

    Calculate the proportion of defective chips in the 500 tested. Assuming that a binomial distribution holds, use this value to calculate the expected frequencies corresponding to the observed frequencies in the table.

  7. In a large school, 80% of the pupils like mathematics. A visitor to the school asks each of 4 pupils, chosen at random, whether they like mathematics.
    1. Calculate the probabilities of obtaining an answer yes from 0, 1, 2, 3, 4 of the pupils
    2. Find the probability that the visitor obtains the answer yes from at least 2 pupils:
      1. when the number of pupils questioned remains at 4
      2. when the number of pupils questioned is increased to 8.
  8. A machine has two drive belts, one on the left and one on the right. From time to time the drive belts break. When one breaks the machine is stopped and both belts are replaced. Details of $n$ consecutive breakages are recorded. Assume that the left and right belts are equally likely to break first. Let $X$ be the number of times the break is on the left.
    1. How many possible different sequences of “left” and “right” are there?
    2. How many of these sequences contain exactly j “lefts”?
    3. Find an expression, in terms of $n$ and $j$, for the probability that $X = j$.
    4. Let $n = 6$. Find the probability distribution of $X$.
  9. A machine is built to make mass-produced items. Each item made by the machine has a probability $p$ of being defective. Given the value of $p$, the items are independent of each other. Because of the way in which the machines are made, $p$ could take one of several values. In fact $p = \frac{X}{100}$ where $X$ has a discrete uniform distribution on the interval $[0, 5]$. The machine is tested by counting the number of items made before a defective is produced. Find the conditional probability distribution of $X$ given that the first defective item is the thirteenth to be made.
  10. Seven batches of articles are manufactured. Each batch contains ten articles. Each article has, independently, a probability of 0.1 of being defective. Find the probability that there is at least one defective article
    1. in exactly four of the batches,
    2. in four or more of the batches.
  11. A service engineer can be called out for maintenance on the photocopiers in the offices of four large companies, A, B, C and D. On any given week there is a probability of 0.1 that he will be called to each of these companies. The event of being called to one company is independent of whether or not he is called to any of the others.
    1. Find the probability that, on a particular day,
      1. he is called to all four companies,
      2. he is called to at least three companies,
      3. he is called to all four given that he is called to at least one,
      4. he is called to all four given that he is called to Company A.
    2. Find the expected value and variance of the number of these companies which call the engineer on a given day.
  12. There are five machines in a factory. Of these machines, three are working properly and two are defective. Machines which are working properly produce articles each of which has independently a probability of 0.1 of being imperfect. For the defective machines this probability is 0.2. A machine is chosen at random and five articles produced by the machine are examined. What is the probability that the machine chosen is defective given that, of the five articles examined, two are imperfect and three are perfect?
  13. A company buys mass-produced articles from a supplier. Each article has a probability $p$ of being defective, independently of other articles. If the articles are manufactured correctly then $p = 0.05$. However, a cheaper method of manufacture can be used and this results in $p = 0.1$.
    1. Find the probability of observing exactly three defectives in a sample of twenty articles
      1. given that $p = 0.05$
      2. given that $p = 0.1$.
    2. The articles are made in large batches. Unfortunately batches made by both methods are stored together and are indistinguishable until tested, although all of the articles in any one batch will be made by the same method. Suppose that a batch delivered to the company has a probability of 0.7 of being made by the correct method. Find the conditional probability that such a batch is correctly manufactured given that, in a sample of twenty articles from the batch, there are exactly three defectives.
    3. The company can either accept or reject a batch. Rejecting a batch leads to a loss for the company of £150. Accepting a batch which was manufactured by the cheap method will lead to a loss for the company of £400. Accepting a batch which was correctly manufactured leads to a profit of £500. Determine a rule for what the company should do if a sample of twenty articles contains exactly three defectives, in order to maximise the expected value of the profit (where loss is negative profit). Should such a batch be accepted or rejected?
    4. Repeat the calculation for four defectives in a sample of twenty and hence, or otherwise, determine a rule for how the company should decide whether to accept or reject a batch according to the number of defectives.
Answers
  1. Binomial distribution: $P(X = r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n-r}$, where $p$ is the probability of a single ‘success’, which is ‘tyre burst’.

    1. $P(X = 1) = {}^{17}C_{1}\,(0.05)^{1}(0.95)^{16} = 0.3741$
    2. $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = (0.95)^{17} + 17(0.05)(0.95)^{16} + \frac{17 \times 16}{2 \times 1}(0.05)^{2}(0.95)^{15} + \frac{17 \times 16 \times 15}{3 \times 2 \times 1}(0.05)^{3}(0.95)^{14} = 0.9912$
    3. $P(X \ge 2) = 1 - P[(X = 0) \cup (X = 1)] = 1 - (0.95)^{17} - 17(0.05)(0.95)^{16} = 0.2077$
    1. $P(\text{distortion}) = 0.01$ for each digit. This is a binomial situation in which the probability of ‘success’ is $p = 0.01$ and there are $n = 8$ trials.
      A word is decoded incorrectly if there are two or more digits in error: $P(X \ge 2) = 1 - P[(X = 0) \cup (X = 1)] = 1 - {}^{8}C_{0}\,(0.99)^{8} - {}^{8}C_{1}\,(0.01)(0.99)^{7} = 0.00269$
    2. Same as (a) with $n = 10$. Correct decoding if $X \le 2$: $P(X \le 2) = P[(X = 0) \cup (X = 1) \cup (X = 2)] = (0.99)^{10} + 10(0.01)(0.99)^{9} + 45(0.01)^{2}(0.99)^{8} = 0.99989$
  2. Let $X$ be a random variable ‘number of answers guessed correctly’; then for each question (i.e. trial) the probability of a ‘success’ is $\frac{1}{5}$. It is clear that $X$ follows a binomial distribution with $n = 10$ and $p = 0.2$.

    $P(\text{randomly choosing correct answer}) = \frac{1}{5}$, $n = 10$

    $P(8 \text{ or more correct}) = P[(X = 8) \cup (X = 9) \cup (X = 10)] = {}^{10}C_{8}\,(0.2)^{8}(0.8)^{2} + {}^{10}C_{9}\,(0.2)^{9}(0.8) + {}^{10}C_{10}\,(0.2)^{10} = 0.000078$
    1. 0.9841
    2. 0.9999
    3. 1.0000
  3. $P(X \ge 38) = P(X = 38) + P(X = 39) + P(X = 40) = 0.00626 + 0.1067 + 0.88676 = 0.99975$
    1. 0.3487
    2. 0.3874
    3. 0.1937
    4. 0.0702
  4. 0.15 (total defectives = 0 + 34 + 32 + 9 + 0 out of 500 tested); 44, 39, 14, 2, 0, 0
    1. 0.0016, 0.0256, 0.1536, 0.4096, 0.4096;
      1. 0.9728
      2. 0.9988
    1. There are $2^{n}$ possible sequences.
    2. The number containing exactly $j$ “lefts” is $\binom{n}{j}$.
    3. $P(X = j) = \dfrac{\binom{n}{j}}{2^{n}}$.
    4. With $n = 6$ the distribution of $X$ is

      | $j$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
      |---|---|---|---|---|---|---|---|
      | $P(X = j)$ | 0.015625 | 0.09375 | 0.234375 | 0.3125 | 0.234375 | 0.09375 | 0.015625 |
  5. Let $Y$ be the number of the first defective item.

    $$P(X = j \mid Y = 13) = \frac{P(X = j) \times P(Y = 13 \mid X = j)}{\sum_{i=0}^{5} P(X = i) \times P(Y = 13 \mid X = i)} = \frac{P(Y = 13 \mid X = j)}{\sum_{i=0}^{5} P(Y = 13 \mid X = i)}$$

    since $P(X = j) = \frac{1}{6}$ for $j = 0, \dots, 5$.

    $$P(Y = 13 \mid X = j) = \left(1 - \frac{X}{100}\right)^{12} \frac{X}{100}$$

    | $j$ | $P(Y = 13 \mid X = j)$ | $P(X = j \mid Y = 13)$ |
    |---|---|---|
    | 0 | 0.00000 | 0.0000 |
    | 1 | 0.00886 | 0.0707 |
    | 2 | 0.01569 | 0.1251 |
    | 3 | 0.02082 | 0.1660 |
    | 4 | 0.02451 | 0.1954 |
    | 5 | 0.02702 | 0.2154 |
    | 6 | 0.02856 | 0.2277 |
    | Total | 0.12546 | 1 |
  6. The probability of at least one defective in a batch is $1 - 0.9^{10} = 0.6513$.

    Let the probability of at least one defective in exactly $j$ batches be $p_j$.

    1. $p_4 = \binom{7}{4}\left(1 - 0.9^{10}\right)^{4}\left(0.9^{10}\right)^{3} = 35 \times 0.6513^{4} \times 0.3487^{3} = 0.2670$.
    2. $p_5 = \binom{7}{5}\left(1 - 0.9^{10}\right)^{5}\left(0.9^{10}\right)^{2} = 21 \times 0.6513^{5} \times 0.3487^{2} = 0.2993$.

      $p_6 = \binom{7}{6}\left(1 - 0.9^{10}\right)^{6}\left(0.9^{10}\right)^{1} = 7 \times 0.6513^{6} \times 0.3487^{1} = 0.1863$.

      $p_7 = \binom{7}{7}\left(1 - 0.9^{10}\right)^{7}\left(0.9^{10}\right)^{0} = 0.6513^{7} = 0.0497$.

      The probability of at least one defective in four or more of the batches is

      $p_4 + p_5 + p_6 + p_7 = 0.8023$.

    1. Let $Y$ be the number of companies to which the engineer is called and let $A$ denote the event that the engineer is called to company A.
      1. $P(Y = 4) = 0.1^{4} = 0.0001$.
      2. $P(Y \ge 3) = \binom{4}{3} \times 0.1^{3} \times 0.9^{1} + 0.1^{4} = 0.0037$.
      3. $P(Y = 4 \mid Y \ge 1) = \dfrac{P(Y = 4 \cap Y \ge 1)}{P(Y \ge 1)} = \dfrac{P(Y = 4)}{P(Y \ge 1)} = \dfrac{0.0001}{1 - 0.9^{4}} = \dfrac{0.0001}{0.3439} = \dfrac{1}{3439} = 0.0003$.
      4. $P(Y = 4 \mid A) = \dfrac{P(Y = 4 \cap A)}{P(A)} = \dfrac{P(Y = 4)}{P(A)} = \dfrac{0.0001}{0.1} = 0.0010$.
    2. The mean is $E(Y) = 4 \times 0.1 = 0.4$. The variance is $V(Y) = 4 \times 0.1 \times 0.9 = 0.36$.
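  7. Let $D$ denote the event that the chosen machine is defective and $\bar{D}$ denote the event “not $D$”.

    Let $Y$ be the number of imperfect articles in the sample of five.

    Then

    $$\begin{aligned}
    P(D \mid Y = 2) &= \frac{P(D) \times P(Y = 2 \mid D)}{P(D) \times P(Y = 2 \mid D) + P(\bar{D}) \times P(Y = 2 \mid \bar{D})} \\
    &= \frac{\frac{2}{5} \times \binom{5}{2} \times 0.2^{2} \times 0.8^{3}}{\frac{2}{5} \times \binom{5}{2} \times 0.2^{2} \times 0.8^{3} + \frac{3}{5} \times \binom{5}{2} \times 0.1^{2} \times 0.9^{3}} \\
    &= \frac{2 \times 0.2^{2} \times 0.8^{3}}{2 \times 0.2^{2} \times 0.8^{3} + 3 \times 0.1^{2} \times 0.9^{3}} \\
    &= \frac{0.04096}{0.04096 + 0.02187} = 0.6519.
    \end{aligned}$$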
      1. $p_3 = \binom{20}{3}\, 0.1^{3} \times 0.9^{17} = \dfrac{20 \times 19 \times 18}{1 \times 2 \times 3} \times 0.1^{3} \times 0.9^{17} = 0.190$.
      2. $p_2 = \binom{20}{2}\, 0.1^{2} \times 0.9^{18} = \frac{3}{18} \times 9 \times p_3 = 0.28518$

        $p_1 = \binom{20}{1}\, 0.1 \times 0.9^{19} = \frac{2}{19} \times 9 \times p_2 = 0.27017$

        $p_0 = \binom{20}{0}\, 0.9^{20} = 0.12158$.

        The total probability is 0.867.

      3. The required probability is the probability of at most 2 out of 16.

        $p_0 = P(0 \text{ out of } 16) = 0.9^{16} = 0.185302$

        $p_1 = P(1 \text{ out of } 16) = \frac{16}{9} \times p_0 = 0.3294258$

        $p_2 = P(2 \text{ out of } 16) = \frac{15}{2} \times \frac{1}{9} \times p_1 = 0.2745215$
    1. $\dfrac{0.2 \times \binom{4}{1} \times 0.3^{1} \times 0.7^{3}}{0.2 \times \binom{4}{1} \times 0.3^{1} \times 0.7^{3} + 0.8 \times \binom{4}{1} \times 0.1^{1} \times 0.9^{3}} = \dfrac{0.02058}{0.02058 + 0.05832} = 0.2608$.
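Several of the answers above are sums of binomial probabilities and are easy to recompute. The following Python sketch reproduces the tyre-burst answers ($n = 17$, $p = 0.05$) and the drive-belt distribution ($n = 6$, $p = 0.5$). The helper names `binomial_pmf` and `binomial_cdf` are assumptions introduced for illustration and are not part of the original text.

```python
from math import comb


def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binomial_cdf(n, p, r):
    """P(X <= r) for X ~ B(n, p), summed term by term."""
    return sum(binomial_pmf(n, p, k) for k in range(r + 1))


# Tyre-burst exercise: 17 riders, probability 0.05 of a burst tyre each
exactly_one = binomial_pmf(17, 0.05, 1)
at_most_three = binomial_cdf(17, 0.05, 3)
two_or_more = 1 - binomial_cdf(17, 0.05, 1)

# Drive-belt exercise: n = 6 breakages, left and right equally likely
belt_distribution = [binomial_pmf(6, 0.5, j) for j in range(7)]

print(round(exactly_one, 4), round(at_most_three, 4), round(two_or_more, 4))
print(belt_distribution)
```

The same two helpers cover most of the remaining exercises by changing $n$, $p$ and the range of values summed; small differences in the last decimal place can arise from rounding in the printed answers.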