1 Continuous probability distributions

In order to get a good understanding of continuous probability distributions it is advisable to start by answering some fairly obvious questions such as: “What is a continuous random variable?” “Is there any carry over from the work we have already done on discrete random variables and distributions?” We shall start with some basic concepts and definitions.

1.1 Continuous random variables

In day-to-day situations met by practising engineers, experiments such as measuring the current in a piece of wire or measuring the dimensions of machined components play a part. However closely an engineer tries to control an experiment, there will always be small variations in the results obtained, due to factors outside the engineer's control. Such influences include changes in ambient temperature, which may affect the accuracy of the measuring devices used, and slight variations in the chemical composition of the materials used to produce the objects (wire, machined components in this case) under investigation. In the case of machined components, many of the small variations seen in measurements may be due to vibration, cutting tool wear in the machine producing the component, changes in the raw material used and the process used to refine it, and even the measurement process itself!

Such variations (in current and length, for example) can be represented by a random variable, and it is customary to define an interval, finite or infinite, within which variation can take place. Since such a variable ($X$ say) can assume any value within an interval, we say that the variable is continuous rather than discrete: its values form an entity we can think of as a continuum.
The following definition summarizes the situation.

1.2 Definition

A random variable X is said to be continuous if it can assume any value in a given interval. This contrasts with the definition of a discrete random variable which can only assume discrete values.

1.3 Practical example

This example will help you to see how continuous random variables arise and will help you to distinguish between continuous and discrete random variables.

Consider a de-magnetised compass needle mounted at its centre so that it can spin freely. Its initial position is shown in Figure 1(a). It is spun clockwise and when it comes to rest the angle θ , from the vertical, is measured. See Figure 1(b).

Figure 1 (a) and (b)

[Diagram: (a) the needle in its initial position; (b) the needle at rest at angle θ measured from the vertical]

Let X be the random variable

“angle θ measured after each spin”

Firstly, note that X is a random variable since it can take any value in the interval 0 to 2π and we cannot be sure in advance which value it will take. However, after each spin and thinking in probability terms, there are certainly two distinct questions we can ask:

  • What is the probability that X lies between two given values a and b?
  • What is the probability that X takes one particular value exactly?

The first question is easy to answer provided we assume that the probability of the needle coming to rest in a given interval is given by the formula:

\[ \text{Probability} = \frac{\text{given interval in radians}}{\text{total interval in radians}} = \frac{\text{given interval in radians}}{2\pi} \]

The following results are easily obtained and they clearly coincide with what we intuitively feel is correct:

  1. $P\left(0 < X < \frac{\pi}{2}\right) = \frac{1}{4}$ since the interval $\left(0, \frac{\pi}{2}\right)$ covers one quarter of a full circle
  2. $P\left(\frac{\pi}{2} < X < 2\pi\right) = \frac{3}{4}$ since the interval $\left(\frac{\pi}{2}, 2\pi\right)$ covers three quarters of a full circle.

It is easy to see the generalisation of this result for the interval $(a, b)$, in which both $a$ and $b$ lie in the interval $(0, 2\pi)$:

\[ P(a < X < b) = \frac{b - a}{2\pi} \]

The second question immediately presents problems! Answering a question of this kind would require a measuring device (e.g. a protractor) with infinite precision: no such device exists, nor could one ever be constructed. Hence it can never be verified that the needle, after spinning, takes any particular value; all we can be reasonably sure of is that the needle lies between two particular values.

We conclude that in experiments of this kind we can never determine the probability that the random variable assumes a particular value, but only the probability that it lies within a given range of values. This kind of random variable is called a continuous random variable and it is characterised, not by probabilities of the type $P(X = c)$ (as was the case with a discrete random variable), but by a function $f(x)$ called the probability density function (pdf for short). In the case of the rotating needle this function takes the simple form given below, with the corresponding plot:

\[ f(x) = \begin{cases} \dfrac{1}{2\pi}, & 0 \le x < 2\pi \\[4pt] 0, & \text{elsewhere} \end{cases} \]

Figure 2: (a) Definition of the pdf (b) Plot of the pdf

The probability $P(a < X < b)$ is the area under the curve $f(x)$ between $x = a$ and $x = b$, and so is given by the integral

\[ \int_a^b f(x)\,dx \]

Suppose we wanted to find $P\left(\frac{\pi}{6} < X < \frac{\pi}{4}\right)$. Then, using the definition of the pdf for this case:

\[ P\left(\frac{\pi}{6} < X < \frac{\pi}{4}\right) = \int_{\pi/6}^{\pi/4} \frac{1}{2\pi}\,dx = \left[\frac{x}{2\pi}\right]_{\pi/6}^{\pi/4} = \frac{1}{2\pi}\left(\frac{\pi}{4} - \frac{\pi}{6}\right) = \frac{1}{2\pi} \times \frac{\pi}{12} = \frac{1}{24} \]

This is reasonable since the interval $\left(\frac{\pi}{6}, \frac{\pi}{4}\right)$ is one twenty-fourth of the interval $0$ to $2\pi$.
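The arithmetic above is easy to check numerically. The sketch below is illustrative only, assuming SciPy is available; the function name is ours, not from the text.

```python
# A minimal numerical check of P(pi/6 < X < pi/4) for the spinning-needle pdf.
from math import pi
from scipy.integrate import quad

def needle_pdf(x):
    """Uniform pdf of the needle angle on [0, 2*pi)."""
    return 1.0 / (2.0 * pi) if 0.0 <= x < 2.0 * pi else 0.0

prob, _ = quad(needle_pdf, pi / 6, pi / 4)
print(prob, 1 / 24)   # both print as approximately 0.0416666...
```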

In general terms we have

\[ P(a < X < b) = \int_a^b f(t)\,dt = F(b) - F(a) = \frac{b - a}{2\pi} \]

for the pdf under consideration here. Note also that

  1. $f(x) \ge 0$ for all real $x$
  2. $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{2\pi} \frac{1}{2\pi}\,dx = 1$, i.e. the total probability is 1.

We are now in a position to give a formal definition of a continuous random variable in Key Point 1.

Key Point 1

X is said to be a continuous random variable if there exists a function $f(x)$ associated with $X$, called the probability density function, with the properties:

  • $f(x) \ge 0$ for all $x$
  • $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$
  • $\displaystyle P(a < X < b) = \int_a^b f(x)\,dx = F(b) - F(a)$

The first two bullet points in Key Point 1 are the analogues of the results $P(X = x_i) \ge 0$ and $\sum_i P(X = x_i) = 1$ for discrete random variables.

Task!

Which of the following are not probability density functions?

[Figure: graphs of two candidate pdfs, (i) $f(x) = x$ on $0 \le x \le 1$ and (ii) $f(x) = 1 - \frac{1}{2}x$ on $0 \le x \le 2$, each zero elsewhere]

(iii)
\[ f(x) = \begin{cases} x^2 - 4x + \dfrac{10}{3}, & 0 \le x \le 3 \\[4pt] 0, & \text{elsewhere} \end{cases} \]

Check whether the first two statements in Key Point 1 are satisfied for each pdf above:

(i) We can write $f(x) = \begin{cases} x, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$

Here $f(x) \ge 0$ for all $x$, but $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2} \ne 1$.

Thus this function is not a valid probability density function because the integral’s value is not 1.

(ii)

Note that $f(x) = \begin{cases} 1 - \dfrac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$ so that $f(x) \ge 0$ for all $x$, and

\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_0^2 \left(1 - \frac{1}{2}x\right) dx = \left[x - \frac{x^2}{4}\right]_0^2 = 2 - 1 = 1 \]

(Alternatively, the area of the triangle is $\frac{1}{2} \times 1 \times 2 = 1$.)

This implies that $f(x)$ is a valid probability density function.

(iii)

\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_0^3 \left(x^2 - 4x + \frac{10}{3}\right) dx = \left[\frac{x^3}{3} - 2x^2 + \frac{10}{3}x\right]_0^3 = 9 - 18 + 10 = 1 \]

but $f(x) < 0$ for some values of $x$ in the interval (for example, $f(2) = 4 - 8 + \frac{10}{3} = -\frac{2}{3} < 0$). Hence (iii) is not a pdf.
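If a quick computational check is wanted, the three candidates can be tested against the first two properties of Key Point 1. This is an illustrative sketch only, assuming NumPy and SciPy are available; the dictionary of candidates is our own construction.

```python
# Check non-negativity (on a grid) and total area for the three candidates above.
import numpy as np
from scipy.integrate import quad

candidates = {
    "(i)":   (lambda x: x,                          0.0, 1.0),
    "(ii)":  (lambda x: 1.0 - 0.5 * x,              0.0, 2.0),
    "(iii)": (lambda x: x**2 - 4.0 * x + 10.0 / 3,  0.0, 3.0),
}

for name, (f, a, b) in candidates.items():
    area, _ = quad(f, a, b)                          # total probability over the support
    grid = np.linspace(a, b, 1001)
    nonneg = all(f(x) >= 0 for x in grid)            # crude non-negativity check
    print(name, "area =", round(area, 4), "non-negative:", nonneg)

# Expected output: (i) area 0.5 (not a pdf); (ii) area 1.0 and non-negative (valid);
# (iii) area 1.0 but negative near x = 2 (not a pdf).
```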

Task!

Find the probability that $X$ takes a value between $-1$ and $1$ when the pdf is given by the following figure.

[Figure: triangular pdf on $-2 \le x \le 2$, rising linearly from $0$ at $x = -2$ to a peak value $k$ at $x = 0$ and falling back to $0$ at $x = 2$]

First find k :

\[ \int_{-\infty}^{\infty} f(x)\,dx = \text{area under curve} = \text{area of triangle} = \frac{1}{2} \times 4 \times k = 2k \]

Also $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$, so $2k = 1$ and hence $k = \frac{1}{2}$.

State the formula for f ( x ) :

\[ f(x) = \begin{cases} \dfrac{1}{2} - \dfrac{1}{4}x, & 0 \le x \le 2 \\[4pt] \dfrac{1}{2} + \dfrac{1}{4}x, & -2 \le x < 0 \\[4pt] 0, & \text{elsewhere.} \end{cases} \]

Write down an integral to represent $P(-1 < X < 1)$. Use symmetry to evaluate the integral.

\[ \int_{-1}^{1} f(x)\,dx = 2\int_0^1 \left(\frac{1}{2} - \frac{1}{4}x\right) dx = 2\left[\frac{1}{2}x - \frac{1}{8}x^2\right]_0^1 = 2\left(\frac{1}{2} - \frac{1}{8}\right) = \frac{3}{4} \]

1.4 The cumulative distribution function

Analogous to the formula for the cumulative distribution function:

\[ F(x) = \sum_{x_i \le x} P(X = x_i) \]

used in the case of a discrete random variable $X$ with associated probabilities $P(X = x_i)$, we define a cumulative probability distribution function $F(x)$ for a continuous random variable by means of the integral (which is a limiting form of a sum):

\[ F(x) = \int_{-\infty}^{x} f(t)\,dt \]

The cdf represents the probability of observing a value less than or equal to x .
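For example, applying this definition to the spinning-needle pdf of Section 1.3 gives

\[ F(x) = \int_{-\infty}^{x} f(t)\,dt = \begin{cases} 0, & x < 0 \\[4pt] \dfrac{x}{2\pi}, & 0 \le x < 2\pi \\[4pt] 1, & x \ge 2\pi \end{cases} \]

so that $P(a < X < b) = F(b) - F(a) = \dfrac{b - a}{2\pi}$ for $0 \le a < b < 2\pi$, as found earlier.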

Task!

For the pdf in the diagram below

[Figure: the same triangular pdf as in the previous Task, with peak $\frac{1}{2}$ at $x = 0$ on $-2 \le x \le 2$]

obtain the cdf and verify the result obtained in the previous Task for $P(-1 \le X \le 1)$.

\[ F(x) = \begin{cases} 0, & x \le -2 \\[4pt] \dfrac{1}{2} + \dfrac{1}{2}x + \dfrac{1}{8}x^2, & -2 < x < 0 \\[4pt] \dfrac{1}{2} + \dfrac{1}{2}x - \dfrac{1}{8}x^2, & 0 \le x \le 2 \\[4pt] 1, & x \ge 2 \end{cases} \]

\[ P(-1 \le X \le 1) = F(1) - F(-1) = \left(\frac{1}{2} + \frac{1}{2} - \frac{1}{8}\right) - \left(\frac{1}{2} - \frac{1}{2} + \frac{1}{8}\right) = \frac{7}{8} - \frac{1}{8} = \frac{3}{4}. \]
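The same answer can be reproduced numerically by integrating the triangular pdf. This is an illustrative sketch, assuming SciPy is available; the helper names are ours.

```python
from scipy.integrate import quad

def tri_pdf(x):
    """Triangular pdf on [-2, 2] with peak 1/2 at x = 0 (k = 1/2 from the earlier Task)."""
    if -2.0 <= x < 0.0:
        return 0.5 + 0.25 * x
    if 0.0 <= x <= 2.0:
        return 0.5 - 0.25 * x
    return 0.0

def cdf(x):
    """F(x), obtained by integrating the pdf from -2 up to x."""
    value, _ = quad(tri_pdf, -2.0, x)
    return value

print(cdf(1.0) - cdf(-1.0))   # approximately 0.75, agreeing with 3/4
```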
Example 1

Traditional electric light bulbs are known to have a mean lifetime to failure of 2000 hours. It is also known that the distribution function p ( t ) of the time to failure takes the form

\[ p(t) = 1 - e^{-t/\mu} \]

where $\mu$ is the mean time to failure. You will see, if you study the topic of reliability in more detail, that this is a realistic distribution function. The reliability function $r(t)$, giving the probability that the light bulb is still working at time $t$, is defined as

\[ r(t) = 1 - p(t) = e^{-t/\mu} \]

Find the proportion of light bulbs that you would expect to fail before 1500 hours and the proportion you would expect to last longer than 2500 hours.

Solution

Let T be the random variable ‘time to failure’.

The proportion of bulbs expected to fail before 1500 hours is given as

\[ P(T < 1500) = 1 - e^{-1500/2000} = 1 - e^{-3/4} = 1 - 0.4724 = 0.5276 \]

The proportion of bulbs expected to last longer than 2500 hours is given as

\[ P(T > 2500) = 1 - P(T \le 2500) = e^{-2500/2000} = e^{-5/4} = 0.2865. \]

Using $r(t) = 1 - p(t)$ we have $r(2500) = 0.2865$.

Hence we expect just under 53% of light bulbs to fail before 1500 hours service and just under 29% of light bulbs to give over 2500 hours service.
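A short computational restatement of this example follows. It is a sketch only; the function names p and r mirror the notation above but are otherwise our own choice.

```python
from math import exp

mu = 2000.0                      # mean lifetime to failure, in hours

def p(t):
    """Distribution function: probability of failure by time t."""
    return 1.0 - exp(-t / mu)

def r(t):
    """Reliability function: probability the bulb is still working at time t."""
    return 1.0 - p(t)

print(p(1500))   # about 0.5276, the proportion failing before 1500 hours
print(r(2500))   # about 0.2865, the proportion lasting beyond 2500 hours
```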

1.5 Mean and variance of a continuous distribution

You will probably have realised by now that, essentially, the descriptions of discrete and continuous random variables run in parallel, provided we use the analogues given in the following table:

Quantity       | Discrete Variable                      | Continuous Variable
---------------|----------------------------------------|---------------------
Probability    | $P(X = x)$                             | $f(x)\,dx$
Allowed values | $P(X = x) \ge 0$                       | $f(x) \ge 0$
Summation      | $\sum_x P(X = x)$                      | $\int f(x)\,dx$
Expectation    | $E(X) = \sum_x x\,P(X = x)$            | ?
Variance       | $V(X) = \sum_x (x - \mu)^2\,P(X = x)$  | ?

Completing the above table of analogues to write down the mean and variance of a continuous variable leads to the obvious definitions given in Key Point 2:

Key Point 2

Let $X$ be a continuous random variable with associated pdf $f(x)$. Then its expectation and variance, denoted by $E(X)$ (or $\mu$) and $V(X)$ (or $\sigma^2$) respectively, are given by:

\[ \mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx \]

and

\[ \sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 \]

As with discrete random variables, the variance $V(X)$ can be written in an alternative form, more amenable to calculation:

\[ V(X) = E(X^2) - [E(X)]^2 \]

where $E(X^2) = \displaystyle\int_{-\infty}^{\infty} x^2 f(x)\,dx$.

Task!

For the variable X with pdf

\[ f(x) = \begin{cases} \dfrac{1}{2}x, & 0 \le x \le 2 \\[4pt] 0, & \text{elsewhere} \end{cases} \]

find E ( X ) and then V ( X ) .

First find E ( X ) :

\[ E(X) = \int_0^2 \frac{1}{2}x \cdot x\,dx = \left[\frac{1}{6}x^3\right]_0^2 = \frac{8}{6} = \frac{4}{3}. \]

Now find E ( X 2 ) :

\[ E(X^2) = \int_0^2 \frac{1}{2}x \cdot x^2\,dx = \left[\frac{1}{8}x^4\right]_0^2 = 2. \]

Now find V ( X ) :

\[ V(X) = E(X^2) - [E(X)]^2 = 2 - \frac{16}{9} = \frac{2}{9}. \]
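These values can be cross-checked numerically. The sketch below is illustrative only, assuming SciPy is available; the variable names are ours.

```python
from scipy.integrate import quad

pdf = lambda x: 0.5 * x                           # f(x) = x/2 on [0, 2]

mean, _   = quad(lambda x: x * pdf(x), 0, 2)      # E(X)
second, _ = quad(lambda x: x**2 * pdf(x), 0, 2)   # E(X^2)
variance  = second - mean**2                      # V(X) = E(X^2) - [E(X)]^2

print(mean, variance)   # approximately 1.3333 (= 4/3) and 0.2222 (= 2/9)
```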
Task!

The mileage (in thousands of miles) for which a certain type of tyre will last is a random variable with pdf

\[ f(x) = \begin{cases} \dfrac{1}{20} e^{-x/20}, & x > 0 \\[4pt] 0, & x \le 0 \end{cases} \]

Find the probability that the tyre will last

  1. at most 10,000 miles;
  2. between 16,000 and 24,000 miles;
  3. at least 30,000 miles.
  1. $P(a < X < b) = \displaystyle\int_a^b f(x)\,dx$, so $P(X < 10) = \displaystyle\int_{-\infty}^{10} f(x)\,dx = \int_0^{10} \frac{1}{20} e^{-x/20}\,dx = \left[-e^{-x/20}\right]_0^{10} = 1 - e^{-1/2} = 0.393$
  2. $P(16 < X < 24) = \displaystyle\int_{16}^{24} \frac{1}{20} e^{-x/20}\,dx = \left[-e^{-x/20}\right]_{16}^{24} = -e^{-1.2} + e^{-0.8} = 0.148$
  3. $P(X > 30) = \displaystyle\int_{30}^{\infty} \frac{1}{20} e^{-x/20}\,dx = \left[-e^{-x/20}\right]_{30}^{\infty} = e^{-1.5} = 0.223$
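The three probabilities can also be evaluated directly from the exponential cdf $F(x) = 1 - e^{-x/20}$. The sketch below is illustrative only; the function name is ours.

```python
from math import exp

def tyre_cdf(x):
    """P(X <= x) for the pdf f(x) = (1/20) e^(-x/20), x in thousands of miles."""
    return 1.0 - exp(-x / 20.0) if x > 0 else 0.0

print(tyre_cdf(10))                  # P(X < 10)      ~ 0.393
print(tyre_cdf(24) - tyre_cdf(16))   # P(16 < X < 24) ~ 0.148
print(1.0 - tyre_cdf(30))            # P(X > 30)      ~ 0.223
```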

1.6 Important continuous distributions

There are a number of continuous distributions which have important applications in engineering and science. The areas of application and a little of the history (where appropriate) of the more important and useful distributions will be discussed in the later Sections and other Workbooks devoted to each of the distributions. Among the most important continuous probability distributions are those listed below (a short computational sketch of their density functions follows the list):

  1. the Uniform or Rectangular distribution, where the random variable X is restricted to a finite interval [ a , b ] and f ( x ) has constant density often defined by a function of the form:

    \[ f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\[4pt] 0, & \text{otherwise} \end{cases} \]

    ( HELM booklet  38.2)

  2. the Exponential distribution defined by a probability density function of the form:

    \[ f(t) = \lambda e^{-\lambda t}, \qquad \lambda \text{ is a given constant} \]

    ( HELM booklet  38.3)

  3. the Normal distribution (often called the Gaussian distribution) where the random variable X is defined by a probability density function of the form:

    \[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}, \qquad \mu, \sigma \text{ are given constants} \]

    ( HELM booklet  39)

  4. the Weibull distribution where the random variable X is defined by a probability density function of the form:

    \[ f(x) = \alpha\beta(\alpha x)^{\beta - 1} e^{-(\alpha x)^\beta}, \qquad \alpha, \beta \text{ are given constants} \]

    ( HELM booklet  46.1)
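As noted above, the four density functions can be written out as a short computational sketch. This is illustrative only: the function and parameter names are ours, the ranges of validity are the obvious ones assumed from the formulas, and no particular library parameterisation is implied.

```python
from math import exp, pi, sqrt

def uniform_pdf(x, a, b):
    """Uniform (rectangular) pdf on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def exponential_pdf(t, lam):
    """Exponential pdf, assumed zero for t < 0."""
    return lam * exp(-lam * t) if t >= 0 else 0.0

def normal_pdf(x, mu, sigma):
    """Normal (Gaussian) pdf with mean mu and standard deviation sigma."""
    return exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (sigma * sqrt(2.0 * pi))

def weibull_pdf(x, alpha, beta):
    """Weibull pdf in the parameterisation used above, assumed zero for x <= 0."""
    return alpha * beta * (alpha * x) ** (beta - 1) * exp(-(alpha * x) ** beta) if x > 0 else 0.0

print(uniform_pdf(0.5, 0, 1), exponential_pdf(1.0, 2.0),
      normal_pdf(0.0, 0.0, 1.0), weibull_pdf(1.0, 1.0, 2.0))
```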

Exercises
  1. A target is made of three concentric circles of radii $\frac{1}{\sqrt{3}}$, $1$ and $\sqrt{3}$ metres. Shots within the inner circle count 4 points, within the middle band 3 points and within the outer band 2 points. (Shots outside the target count zero.) The distance of a shot from the centre of the target is a random variable $R$ with density function $f(r) = \dfrac{2}{\pi(1 + r^2)}$, $r > 0$. Calculate the expected value of the score after five shots.
  2. A continuous random variable T has the following probability density function.

    \[ f_T(u) = \begin{cases} 0, & u < 0 \\[4pt] 3\left(1 - \dfrac{u}{k}\right), & 0 \le u \le k \\[4pt] 0, & u > k. \end{cases} \]

    Find

    1. k .
    2. E ( T ) .
    3. E ( T 2 ) .
    4. V ( T ) .
  3. A continuous random variable X has the following probability density function

    \[ f_X(u) = \begin{cases} 0, & u < 0 \\ ku, & 0 \le u \le 1 \\ 0, & u > 1 \end{cases} \]

    1. Find k .
    2. Find the distribution function F X ( u ) .
    3. Find E ( X ) .
    4. Find V ( X ) .
    5. Find E ( e X ) .
    6. Find V ( e X ) .
    7. Find the distribution function of e X . (Hint: For what values of X is e X < u ? )
    8. Find the probability density function of e X .
    9. Sketch f X ( u ) .
    10. Sketch F X ( u ) .
  1. $P(\text{inner circle}) = P\left(0 < r < \frac{1}{\sqrt{3}}\right) = \displaystyle\int_0^{1/\sqrt{3}} \frac{2}{\pi(1 + r^2)}\,dr = \left[\frac{2}{\pi}\tan^{-1} r\right]_0^{1/\sqrt{3}} = \frac{2}{\pi}\tan^{-1}\frac{1}{\sqrt{3}} = \frac{2}{\pi}\cdot\frac{\pi}{6} = \frac{1}{3}$

     $P(\text{middle band}) = P\left(\frac{1}{\sqrt{3}} < r < 1\right) = \displaystyle\int_{1/\sqrt{3}}^{1} \frac{2}{\pi(1 + r^2)}\,dr = \left[\frac{2}{\pi}\tan^{-1} r\right]_{1/\sqrt{3}}^{1} = \frac{2}{\pi}\tan^{-1} 1 - \frac{1}{3} = \frac{1}{2} - \frac{1}{3} = \frac{1}{6}.$

     $P(\text{outer band}) = P(1 < r < \sqrt{3}) = \left[\frac{2}{\pi}\tan^{-1} r\right]_1^{\sqrt{3}} = \frac{2}{\pi}\tan^{-1}\sqrt{3} - \frac{1}{2} = \frac{2}{3} - \frac{1}{2} = \frac{1}{6}$, so $P(\text{miss target}) = 1 - \frac{1}{6} - \frac{1}{6} - \frac{1}{3} = \frac{1}{3}.$

    Let S be the random variable equal to ‘score’.

     $s$        | 0   | 2   | 3   | 4
     $P(S = s)$ | 1/3 | 1/6 | 1/6 | 1/3

     $E(S) = 0 + \frac{2}{6} + \frac{3}{6} + \frac{4}{3} = \frac{13}{6}$

     The expected score after 5 shots is this value times 5, namely $5 \times \frac{13}{6} = \frac{65}{6} \approx 10.83$. (A numerical cross-check of selected answers is sketched after these solutions.)

    1. $1 = \displaystyle\int_0^k 3\left(1 - \frac{u}{k}\right) du = 3\left[u - \frac{u^2}{2k}\right]_0^k = 3\left(k - \frac{k}{2}\right) = \frac{3k}{2}$, so $k = \frac{2}{3}.$
    2. $E(T) = \displaystyle\int_0^{2/3} 3u\left(1 - \frac{3u}{2}\right) du = 3\int_0^{2/3} \left(u - \frac{3u^2}{2}\right) du = 3\left[\frac{u^2}{2} - \frac{u^3}{2}\right]_0^{2/3} = 3\left(\frac{2}{9} - \frac{4}{27}\right) = 3 \cdot \frac{6 - 4}{27} = \frac{2}{9}.$
    3. $E(T^2) = \displaystyle\int_0^{2/3} 3u^2\left(1 - \frac{3u}{2}\right) du = 3\int_0^{2/3} \left(u^2 - \frac{3u^3}{2}\right) du = 3\left[\frac{u^3}{3} - \frac{3u^4}{8}\right]_0^{2/3} = 3\left(\frac{8}{81} - \frac{6}{81}\right) = \frac{2}{27}.$
    4. $V(T) = E(T^2) - \{E(T)\}^2 = \frac{2}{27} - \frac{4}{81} = \frac{2}{81}.$
    1. $1 = \displaystyle\int_0^1 ku\,du = \left[\frac{ku^2}{2}\right]_0^1 = \frac{k}{2}$, so $k = 2.$
    2. $F_X(u) = \begin{cases} 0, & u < 0 \\ u^2, & 0 \le u \le 1 \\ 1, & 1 < u \end{cases}$
    3. $E(X) = \displaystyle\int_0^1 2u^2\,du = \left[\frac{2u^3}{3}\right]_0^1 = \frac{2}{3}.$
    4. $E(X^2) = \displaystyle\int_0^1 2u^3\,du = \left[\frac{2u^4}{4}\right]_0^1 = \frac{1}{2}$, so $V(X) = E(X^2) - \{E(X)\}^2 = \frac{1}{2} - \frac{4}{9} = \frac{1}{18}.$
    5. $E(e^X) = \displaystyle\int_0^1 2u e^u\,du = \left[2u e^u\right]_0^1 - 2\int_0^1 e^u\,du = \left[2u e^u - 2e^u\right]_0^1 = 2e - 2e + 2 = 2$
    6. $E(e^{2X}) = \displaystyle\int_0^1 2u e^{2u}\,du = \left[u e^{2u}\right]_0^1 - \int_0^1 e^{2u}\,du = \left[u e^{2u} - \frac{e^{2u}}{2}\right]_0^1 = e^2 - \frac{e^2}{2} + \frac{1}{2} = \frac{e^2 + 1}{2}$, so $V(e^X) = E(e^{2X}) - \{E(e^X)\}^2 = \frac{e^2 + 1}{2} - 4.$
    7. $P(e^X < u) = P(X < \ln u) = (\ln u)^2$ for $0 < \ln u < 1$, i.e. $1 < u < e$.

       Hence the distribution function of $e^X$ is $F_{e^X}(u) = \begin{cases} 0, & u < 1 \\ (\ln u)^2, & 1 \le u \le e \\ 1, & e < u \end{cases}$

    8. The pdf of $e^X$ is $f_{e^X}(u) = \begin{cases} 0, & u < 1 \\ \dfrac{2\ln u}{u}, & 1 \le u \le e \\ 0, & e < u \end{cases}$
    9. Sketch of pdf:

      [Plot of $f_{e^X}(u) = \dfrac{2\ln u}{u}$ for $1 \le u \le e$]

    10. Sketch of distribution function:

      [Plot of $F_{e^X}(u) = (\ln u)^2$ for $1 \le u \le e$]
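Finally, as mentioned in the solution to Exercise 1, a few of the answers above can be cross-checked numerically. This is an illustrative sketch only, assuming SciPy is available; the variable names are ours.

```python
from math import pi, sqrt, e
from scipy.integrate import quad

# Exercise 1: score distribution for the target
f_r = lambda r: 2.0 / (pi * (1.0 + r ** 2))
p_inner, _  = quad(f_r, 0, 1 / sqrt(3))
p_middle, _ = quad(f_r, 1 / sqrt(3), 1)
p_outer, _  = quad(f_r, 1, sqrt(3))
expected_score = 4 * p_inner + 3 * p_middle + 2 * p_outer    # misses score zero
print(5 * expected_score)            # about 10.83 (= 65/6)

# Exercise 2: with k = 2/3, E(T) = 2/9 and V(T) = 2/81
k = 2.0 / 3.0
f_T = lambda u: 3.0 * (1.0 - u / k)
ET, _  = quad(lambda u: u * f_T(u), 0, k)
ET2, _ = quad(lambda u: u ** 2 * f_T(u), 0, k)
print(ET, ET2 - ET ** 2)             # about 0.2222 (= 2/9) and 0.0247 (= 2/81)

# Exercise 3(5): E(e^X) = 2 when f_X(u) = 2u on [0, 1]
E_expX, _ = quad(lambda u: 2.0 * u * e ** u, 0, 1)
print(E_expX)                        # 2.0
```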