1 Interval estimation for the variance
In Section 40.1 we saw how to find a confidence interval for the mean of a normal population. We can also find a confidence interval for the variance. The corresponding confidence interval for the standard deviation is found by taking square roots.
We know that if we take samples from a population, then each sample will have a mean and a variance associated with it. We can calculate the values of these quantities from first principles, that is we can use the basic definitions of the mean and the variance to find their values. Just as the means form a distribution, so do the values of the variance and it is to this distribution that we turn in order to find an interval estimate for the value of the variance of the population. Note that if the original population is normal, samples taken from this population have means which are normally distributed. When we consider the distribution of variances calculated from the samples we need the chi-squared (usually written as ) distribution in order to calculate the confidence intervals. As you might expect, the values of the chi-squared distribution are tabulated for ease of use. The calculation of confidence intervals for the variance (and standard deviation) depends on the following result.
Key Point 2
If is a random sample taken from a normal population with mean and variance then if the sample variance is denoted by , the random variable
Clearly, a little explanation is required to make this understandable! Key Point 2 refers to the chi-squared distribution and the term ‘degrees of freedom.’ Both require some detailed explanation before the Key Point can be properly understood. We shall start by looking in a little detail at the chi-squared distribution and then consider the term ‘degrees of freedom.’ You are advised to read these explanations very carefully and make sure that you fully understand them.
1.1 The chi-squared random variable
The probability density function of a random variable is somewhat complicated and involves the gamma ( ) function. The gamma function, for positive , is defined as
It is easily shown that and that, if is an integer, then
The probability density function is
The plots in Figure 2 show the probability density function for various convenient values of . We have deliberately taken even values of so that the gamma function has a value easily calculated from the above formula for a factorial. In these graphs the vertical scaling has been chosen to ensure each graph has the same maximum value.
It is possible to discern two things from the diagrams.
Firstly, as increases, the peak of each curve occurs at values closer to . Secondly, as increases, the shape of the curve appears to become more and more symmetrical. In fact the mean of the distribution is and in the limit as the distribution becomes normal. One further fact, not obvious from the diagrams, is that the variance of the distribution is .
Figure 2
A summary is given in the following Key Point.
Key Point 3
The distribution, defined by the probability density function
1.2 Degrees of freedom
A formal definition of the term ‘degrees of freedom’ is that it is the ‘number of independent comparisons that can be made among the elements of a sample.’ Textbooks on statistics e.g. Applied Statistics and Probability for Engineers by Montgomery and Runger (Wiley) often give this formal definition. The number of degrees of freedom is usually represented by the Greek symbol pronounced ‘nu’. The following explanations of the concept should be helpful.
Explanation 1
If we have a sample of values say chosen from a population and we are trying to calculate the mean of the sample, we know that the sum of the deviations about the mean must be zero. Hence, the following constraint must apply to the observations.
Once we calculate the values of we can calculate the value of by using the constraint We say that we have degrees of freedom. The term ‘degrees of freedom’ may be thought of as the number of independent variables minus the number of constraints imposed.
Explanation 2
A point in space which can move freely has three degrees of freedom since it can move independently in the and directions. If we now restrict the point so that it can only move along the straight line
then we have effectively imposed two constraints since the value of (say) determines the values of and . In this situation, we say that the number of degrees of freedom is reduced from 3 to 1. That is, we have one degree of freedom.
A similar argument may be used to demonstrate that a point in three dimensional space which is restricted to move in a plane leads to a situation with two degrees of freedom.
Key Point 4
The term ‘degrees of freedom’ may be thought of as the number of independent variables involved minus the number of constraints imposed.
Figure 3 shows a typical distribution and Table 1 at the end of this Workbook show the values of for a variety of values of the area and the number of degrees of freedom . Notice that Table 1 gives the area values corresponding to the right-hand tail of the distribution which is shown shaded.
Figure 3
The values for (say) right-hand area values of 5% are given by the column headed 0.05 while the values for (say) left-hand area values of 5% are given by the column headed 0.95. Figure 4 shows the values of for the two 5% tails when there are 5 degrees of freedom.
Figure 4
Task!
Use the percentage points of the distribution to find the appropriate values of in the following cases.
- Right-hand tail of 10% and 7 degrees of freedom.
- Left-hand tail of 2.5% and 9 degrees of freedom.
- Both tails of 5% and 10 degrees of freedom.
- Both tails of 2.5% and 20 degrees of freedom.
Using Table 1 and reading off the values directly gives:
- 12.02
- 2.70
- 3.94 and 18.31
- 9.59 and 34.17
1.3 Constructing a confidence interval for the variance
We know that if is a random sample taken from a normal population with mean and variance and if the sample variance is denoted by , the random variable
has a chi-squared distribution with degrees of freedom. This knowledge enables us to construct a confidence interval as follows.
Firstly, we decide on a level of confidence, say, for the sake of illustration, 95%. This means that we need two 2.5% tails.
Secondly, we know that we have degrees of freedom so that the value of will lie between the left-tail value of and the right-tail value of . If we know the value of then we can easily read off these values from the tables.
The confidence interval is developed as shown below.
We have
so that
hence
so that
Another way of stating the same result using probability directly is to say that
Noting that and that we are working with the right-hand tail values of the distribution, it is usual to generalize the above result as follows. Taking a general confidence level as , (a 95% interval gives ), our confidence interval becomes
Note that the confidence interval for the standard deviation is obtained by taking the appropriate square roots.
The following Key Point summarizes the development of this confidence interval.
Key Point 5
If is a random sample with variance taken from a normal population with variance then a confidence interval for is
Example 2
A random sample of 20 nominally measured 2mm diameter steel ball bearings is taken and the diameters are measured precisely. The measurements, in mm, are as follows:
2.02 | 1.94 | 2.09 | 1.95 | 1.98 | 2.00 | 2.03 | 2.04 | 2.08 | 2.07 |
1.99 | 1.96 | 1.99 | 1.95 | 1.99 | 1.99 | 2.03 | 2.05 | 2.01 | 2.03 |
Assuming that the diameters are normally distributed with unknown mean, , and unknown variance ,
- find a two-sided 95% confidence interval for the variance, ;
- find a two-sided confidence interval for the standard deviation, .
Solution
From the data, we calculate and . Hence
There are 19 degrees of freedom and the critical values of the -distribution are
-
the confidence interval for
is
-
the confidence interval for
is
Task!
In a typical car, bell housings are bolted to crankcase castings by means of a series of 13 mm bolts. A random sample of 12 bolt-hole diameters is checked as part of a quality control process and found to have a variance of .
- Construct the 95% confidence interval for the variance of the holes.
- Find the 95% confidence interval for the standard deviation of the holes.
State clearly any assumptions you make.
Using the confidence interval formula developed, we know that the 95% confidence interval is
- The 95% confidence interval for the variance is .
- The 95% confidence interval for the standard deviation is mm.
We have assumed that the hole diameters are normally distributed.
Exercises
-
Measurements are made on the lengths, in mm, of a sample of twenty wooden
components for self-assembly furniture. Assume that these may be regarded as
twenty independent observations from a normal distribution with unknown mean
and unknown
variance
The data are as follows.
581 580 581 577 580 581 577 579 579 578 581 583 577 578 582 581 582 580 582 579 Find a 95% confidence interval for the variance and hence find a 95% confidence interval for the standard deviation
- A machine fills packets with powder. At intervals a sample of ten packets is taken and the packets are weighed. The ten weights may be regarded as a sample of ten independent observations from a normal distribution with unknown mean. Find limits such that the probability that is 0.9 when the population variance is and is the sample variance.
-
From the data we calculate
and
and we have
Hence
The number of degrees of freedom is We know that
with probability 0.95. So a 95% confidence interval for is
That is so
This gives a 95% confidence interval for :
-
There are
degrees of freedom. Now
Hence and
0.995 | 0.990 | 0.975 | 0.950 | 0.900 | 0.500 | 0.100 | 0.050 | 0.025 | 0.010 | 0.005 | |
1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.45 | 2.71 | 3.84 | 5.02 | 6.63 | 7.88 |
2 | 0.01 | 0.02 | 0.05 | 0.01 | 0.21 | 1.39 | 4.61 | 5.99 | 7.38 | 9.21 | 10.60 |
3 | 0.07 | 0.11 | 0.22 | 0.35 | 0.58 | 2.37 | 6.25 | 7.81 | 9.35 | 11.34 | 12.28 |
4 | 0.21 | 0.30 | 0.48 | 0.71 | 1.06 | 3.36 | 7.78 | 9.49 | 11.14 | 13.28 | 14.86 |
5 | 0.41 | 0.55 | 0.83 | 1.15 | 1.61 | 4.35 | 9.24 | 11.07 | 12.83 | 15.09 | 16.75 |
6 | 0.68 | 0.87 | 1.24 | 1.64 | 2.20 | 5.35 | 10.65 | 12.59 | 14.45 | 16.81 | 18.55 |
7 | 0.99 | 1.24 | 1.69 | 2.17 | 2.83 | 6.35 | 12.02 | 14.07 | 16.01 | 18.48 | 20.28 |
8 | 1.34 | 1.65 | 2.18 | 2.73 | 3.49 | 7.34 | 13.36 | 15.51 | 17.53 | 20.09 | 21.96 |
9 | 1.73 | 2.09 | 2.70 | 3.33 | 4.17 | 8.34 | 14.68 | 16.92 | 19.02 | 21.67 | 23.59 |
10 | 2.16 | 2.56 | 3.25 | 3.94 | 4.87 | 9.34 | 15.99 | 18.31 | 20.48 | 23.21 | 25.19 |
11 | 2.60 | 3.05 | 3.82 | 4.57 | 5.58 | 10.34 | 17.28 | 19.68 | 21.92 | 24.72 | 26.76 |
12 | 3.07 | 3.57 | 4.40 | 5.23 | 6.30 | 11.34 | 18.55 | 21.03 | 23.34 | 26.22 | 28.30 |
13 | 3.57 | 4.11 | 5.01 | 5.89 | 7.04 | 12.34 | 19.81 | 22.36 | 24.74 | 27.69 | 29.82 |
14 | 4.07 | 4.66 | 5.63 | 6.57 | 7.79 | 13.34 | 21.06 | 23.68 | 26.12 | 29.14 | 31.32 |
15 | 4.60 | 5.23 | 6.27 | 7.26 | 8.55 | 14.34 | 22.31 | 25.00 | 27.49 | 30.58 | 32.80 |
16 | 5.14 | 5.81 | 6.91 | 7.96 | 9.31 | 15.34 | 23.54 | 26.30 | 28.85 | 31.00 | 34.27 |
17 | 5.70 | 6.41 | 7.56 | 8.67 | 10.09 | 16.34 | 24.77 | 27.59 | 30.19 | 33.41 | 35.72 |
18 | 6.26 | 7.01 | 8.23 | 9.39 | 10.87 | 17.34 | 25.99 | 28.87 | 31.53 | 34.81 | 37.16 |
19 | 6.84 | 7.63 | 8.91 | 10.12 | 11.65 | 18.34 | 27.20 | 30.14 | 32.85 | 36.19 | 38.58 |
20 | 7.43 | 8.26 | 9.59 | 10.85 | 12.44 | 19.34 | 28.41 | 31.41 | 34.17 | 37.57 | 40.00 |
21 | 8.03 | 8.90 | 10.28 | 11.59 | 13.24 | 20.34 | 29.62 | 32.67 | 35.48 | 38.93 | 41.40 |
22 | 8.64 | 9.54 | 10.98 | 12.34 | 14.04 | 21.34 | 30.81 | 33.92 | 36.78 | 40.29 | 42.80 |
23 | 9.26 | 10.20 | 11.69 | 13.09 | 14.85 | 22.34 | 32.01 | 35.17 | 38.08 | 41.64 | 44.18 |
24 | 9.89 | 10.86 | 12.40 | 13.85 | 15.66 | 23.34 | 33.20 | 36.42 | 39.36 | 42.98 | 45.56 |
25 | 10.52 | 11.52 | 13.12 | 14.61 | 16.47 | 24.34 | 34.28 | 37.65 | 40.65 | 44.31 | 46.93 |
26 | 11.16 | 12.20 | 13.84 | 15.38 | 17.29 | 25.34 | 35.56 | 38.89 | 41.92 | 45.64 | 48.29 |
27 | 11.81 | 12.88 | 14.57 | 16.15 | 18.11 | 26.34 | 36.74 | 40.11 | 43.19 | 46.96 | 49.65 |
28 | 12.46 | 13.57 | 15.31 | 16.93 | 18.94 | 27.34 | 37.92 | 41.34 | 44.46 | 48.28 | 50.99 |
29 | 13.12 | 14.26 | 16.05 | 17.71 | 19.77 | 28.34 | 39.09 | 42.56 | 45.72 | 49.59 | 52.34 |
30 | 13.79 | 14.95 | 16.79 | 18.49 | 20.60 | 29.34 | 40.26 | 43.77 | 46.98 | 50.89 | 53.67 |
40 | 20.71 | 22.16 | 24.43 | 26.51 | 29.05 | 39.34 | 51.81 | 55.76 | 59.34 | 63.69 | 66.77 |
50 | 27.99 | 29.71 | 32.36 | 34.76 | 37.69 | 49.33 | 63.17 | 67.50 | 71.42 | 76.15 | 79.49 |
60 | 35.53 | 37.48 | 40.48 | 43.19 | 46.46 | 59.33 | 74.40 | 79.08 | 83.30 | 88.38 | 91.95 |
70 | 43.28 | 45.44 | 48.76 | 51.74 | 55.33 | 69.33 | 85.53 | 90.53 | 95.02 | 100.42 | 104.22 |
80 | 51.17 | 53.54 | 57.15 | 60.39 | 64.28 | 79.33 | 96.58 | 101.88 | 106.63 | 112.33 | 116.32 |
90 | 59.20 | 61.75 | 65.65 | 69.13 | 73.29 | 89.33 | 107.57 | 113.14 | 118.14 | 124.12 | 128.30 |
100 | 67.33 | 70.06 | 74.22 | 77.93 | 82.36 | 99.33 | 118.50 | 124.34 | 129.56 | 135.81 | 140.17 |