1 Continuous probability distributions
In order to get a good understanding of continuous probability distributions it is advisable to start by answering some fairly obvious questions such as: “What is a continuous random variable?” “Is there any carry over from the work we have already done on discrete random variables and distributions?” We shall start with some basic concepts and definitions.
1.1 Continuous random variables
In day-to-day situations met by practising engineers, experiments such as measuring current in a piece of wire or measuring the dimensions of machined components play a part. However closely an engineer tries to control an experiment, there will always be small variations in the results obtained due to many factors: the influence of factors outside the control of the engineer. Such influences include changes in ambient temperature which may affect the accuracy of measuring devices used, slight variation in the chemical composition of the materials used to produce the objects (wire, machined components in this case) under investigation. In the case of machined components, many of the small variations seen in measurements may be due to the influence of vibration, cutting tool wear in the machine producing the component, changes in raw material used and the process used to refine it and even the measurement process itself!
Such variations (current and length for example) can be represented by a
random variable
and it is
customary to define an
interval
, finite or infinite, within which variation can take place. Since such a
variable (
say) can assume
any
value within an interval we say that the variable is
continuous
rather than
discrete. - its values form an entity we can think of as a continuum.
The following definition summarizes the situation.
1.2 Definition
A random variable is said to be continuous if it can assume any value in a given interval. This contrasts with the definition of a discrete random variable which can only assume discrete values.
1.3 Practical example
This example will help you to see how continuous random variables arise and will help you to distinguish between continuous and discrete random variables.
Consider a de-magnetised compass needle mounted at its centre so that it can spin freely. Its initial position is shown in Figure 1(a). It is spun clockwise and when it comes to rest the angle , from the vertical, is measured. See Figure 1(b).
Figure 1 (a) and (b)
Let be the random variable
“angle measured after each spin”
Firstly, note that is a random variable since it can take any value in the interval 0 to and we cannot be sure in advance which value it will take. However, after each spin and thinking in probability terms, there are certainly two distinct questions we can ask.
- What is the probability that lies between two values , i.e. what is ?
- What is the probability that assumes a particular value, say . We are really asking what is the value of ?
The first question is easy to answer provided we assume that the probability of the needle coming to rest in a given interval is given by the formula:
The following results are easily obtained and they clearly coincide with what we intuitively feel is correct:
- since the interval covers one quarter of a full circle
- since the interval covers three quarters of a full circle.
It is easy to see the generalization of this result for the interval , in which both , lie in the interval
The second question immediately presents problems! In order to answer a question of this kind would require a measuring device (e.g. a protractor) with infinite precision: no such device exists nor could one ever be constructed. Hence it can never be verified that the needle, after spinning, takes any particular value; all we can be reasonably sure of is that the needle lies between two particular values.
We conclude that in experiments of this kind we never determine the probability that the random variable assume a particular value but only calculate the probability that it lies within a given range of values. This kind of random variable is called a continuous random variable and it is characterised, not by probabilities of the type (as was the case with a discrete random variable), but by a function called the probability density function (pdf for short). In the case of the rotating needle this function takes the simple form given with corresponding plot:
Figure 2(a) and (b) :
The probability is the area under the function curve and so is given by the integral
Suppose we wanted to find . Then using the definition of the pdf for this case:
This is reasonable since the interval is one twenty-fourth of the interval 0 to .
In general terms we have
for the pdf under consideration here. Note also that
- , for all real
- i.e. total probability is .
We are now in a position to give a formal definition of a continuous random variable in Key Point 1.
Key Point 1
is said to be a continuous random variable if there exists a function associated with called the probability density function with the properties
- for all
The first two bullet points in Key Point 1 are the analogues of the results and for discrete random variables.
Task!
Which of the following are not probability density functions?
- (iii)
Check whether the first two statements in Key Point 1 are satisfied for each pdf above:
(i) We can write
for all but .
Thus this function is not a valid probability density function because the integral’s value is not 1.
(ii)
Note that for all
(Alternatively, the area of the triangle is )
This implies that is a valid probability density function.
(iii)
but for . Hence (iii) is not a pdf.
Task!
Find the probability that takes a value between and when the pdf is given by the following figure.
First find :
Also , so hence
State the formula for :
Write down an integral to represent . Use symmetry to evaluate the integral.
Analogous to the formula for the cumulative distribution function:
used in the case of a discrete random variable with associated probabilities , we define a cumulative probability distribution function by means of the integral (being a form of a sum):
The cdf represents the probability of observing a value less than or equal to .
Task!
For the pdf in the diagram below
obtain the cdf and verify the result obtained in the previous Task for .
Example 1
Traditional electric light bulbs are known to have a mean lifetime to failure of 2000 hours. It is also known that the distribution function of the time to failure takes the form
where is the mean time to failure. You will see if you study the topic of reliability in more detail that this is a realistic distribution function. The reliability function , giving the probability that the light bulb is still working at time , is defined as
Find the proportion of light bulbs that you would expect to fail before 1500 hours and the proportion you would expect to last longer than 2500 hours.
Solution
Let be the random variable ‘time to failure’.
The proportion of bulbs expected to fail before 1500 hours is given as
The proportion of bulbs expected to last longer that 2500 hours is given as
Using we have .
Hence we expect just under 53% of light bulbs to fail before 1500 hours service and just under 29% of light bulbs to give over 2500 hours service.
1.5 Mean and variance of a continuous distribution
You will probably have realised by now that, essentially, the definitions of discrete and continuous random variables are virtually the same provided we use the analogues given in the following table:
Quantity | Discrete Variable | Continuous Variable |
Probability | ||
Allowed Values | ||
Summation | ||
Expectation | ? | |
Variance | ? |
Completing the above table of analogues to write down the mean and variance of a continuous variable leads to the obvious definitions given in Key Point 2:
Key Point 2
Let be a continuous random variable with associated pdf . Then its expectation and variance denoted by (or ) and (or ) respectively are given by:
and
As with discrete random variables the variance can be written in an alternative form, more amenable to calculation:
where
Task!
For the variable with pdf
find and then .
First find :
Now find :
Now find :
Task!
The mileage (in 1000’s of miles) for which a certain type of tyre will last is a random variable with pdf
Find the probability that the tyre will last
- at most 10,000 miles;
- between 16,000 and 24,000 miles;
- at least 30,000 miles.
1.6 Important continuous distributions
There are a number of continuous distributions which have important applications in engineering and science. The areas of application and a little of the history (where appropriate) of the more important and useful distributions will be discussed in the later Sections and other Workbooks devoted to each of the distributions. Among the most important continuous probability distributions are:
-
the Uniform or Rectangular distribution, where the random variable
is restricted to a finite interval
and
has constant density often defined by a function of the form:
( HELM booklet 38.2)
-
the Exponential distribution defined by a probability density function of the form:
( HELM booklet 38.3)
-
the Normal distribution (often called the Gaussian distribution) where the random
variable
is defined by a probability density function of the form:
( HELM booklet 39)
-
the Weibull distribution where the random variable
is defined by a probability density function of the form:
( HELM booklet 46.1)
Exercises
- A target is made of three concentric circles of radii , 1 and metres. Shots within the inner circle count 4 points, within the middle band 3 points and within the outer band 2 points. (Shots outside the target count zero.) The distance of a shot from the centre of the target is a random variable with density function . Calculate the expected value of the score after five shots.
-
A continuous random variable
has the following probability density function.
Find
-
A continuous random variable
has the following probability density function
- Find
- Find the distribution function
- Find
- Find
- Find
- Find
- Find the distribution function of (Hint: For what values of is )
- Find the probability density function of
- Sketch
- Sketch
-
Let be the random variable equal to ‘score’.
0 2 3 4 The expected score after 5 shots is this value times 5 which is:
-
- so
-
-
-
- , so
- so
-
so
-
for
i.e.
Hence the distribution function of is
-
The pdf of
is
-
Sketch of pdf:
-
Sketch of distribution function: