4 Polynomial approximations - experimental data

You may well have experience in carrying out an experiment and then trying to get a straight line to pass as near as possible to the data plotted on graph paper. This process of adjusting a clear ruler over the page until it looks “about right” is fine for a rough approximation, but it is not especially scientific. Any software you use which provides a “best fit” straight line must obviously employ a less haphazard approach.

Here we show one way in which best fit straight lines may be found.

Best fit straight lines

Let us consider the situation mentioned above of trying to get a straight line $y = mx + c$ to be as near as possible to experimental data in the form $(x_1, f_1), (x_2, f_2), (x_3, f_3), \dots$.

Figure 3


We want to minimise the overall distance between the crosses (the data points) and the straight line. There are a few different approaches, but the one we adopt here involves minimising the quantity

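$$
R = \Bigl( \underbrace{m x_1 + c - f_1}_{\text{vertical distance between line and the point } (x_1, f_1)} \Bigr)^2 + \Bigl( \underbrace{m x_2 + c - f_2}_{\text{second data point distance}} \Bigr)^2 + \Bigl( \underbrace{m x_3 + c - f_3}_{\text{third data point distance}} \Bigr)^2 + \dots = \sum \left( m x_n + c - f_n \right)^2 .
$$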

Each term in the sum measures the vertical distance between a data point and the straight line. (Squaring the distances ensures that distances above and below the line do not cancel each other out. It is because we are minimising the distances squared that the straight line we will find is called the least squares best fit straight line.) In order to minimise $R$ we can imagine sliding the clear ruler around on the page until the line looks right; that is, we can imagine varying the slope $m$ and $y$-intercept $c$ of the line. We therefore think of $R$ as a function of the two variables $m$ and $c$ and, as we know from our earlier work on maxima and minima of functions, the minimisation is achieved when

$$
\frac{\partial R}{\partial c} = 0 \quad \text{and} \quad \frac{\partial R}{\partial m} = 0 .
$$

(We know that this will correspond to a minimum because R has no maximum, for whatever value R takes we can always make it bigger by moving the line further away from the data points.)
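To make the quantity $R$ concrete, here is a minimal Python sketch that evaluates it for a trial line. The function name `sum_of_squares` and the three data points are made up for illustration (the points are chosen to lie exactly on $y = 2x + 1$); they do not come from the text.

```python
def sum_of_squares(m, c, xs, fs):
    """Return R, the sum over all data points of (m*x + c - f)**2."""
    return sum((m * x + c - f) ** 2 for x, f in zip(xs, fs))


# Made-up illustrative data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0]
fs = [1.0, 3.0, 5.0]

r_exact = sum_of_squares(2.0, 1.0, xs, fs)    # line through every point
r_tilted = sum_of_squares(2.5, 1.0, xs, fs)   # slope changed
r_shifted = sum_of_squares(2.0, 1.5, xs, fs)  # intercept changed

print(r_exact, r_tilted, r_shifted)
```

Moving the line away from the data, by changing either the slope or the intercept, makes $R$ larger. This is the behaviour the minimisation relies on.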

Differentiating R with respect to m and c gives

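$$
\begin{aligned}
\frac{\partial R}{\partial c} &= 2 (m x_1 + c - f_1) + 2 (m x_2 + c - f_2) + 2 (m x_3 + c - f_3) + \dots = 2 \sum \left( m x_n + c - f_n \right) \\
\text{and} \quad \frac{\partial R}{\partial m} &= 2 (m x_1 + c - f_1) x_1 + 2 (m x_2 + c - f_2) x_2 + 2 (m x_3 + c - f_3) x_3 + \dots = 2 \sum \left( m x_n + c - f_n \right) x_n ,
\end{aligned}
$$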

respectively. Setting both of these quantities equal to zero (and cancelling the factor of 2) gives a pair of simultaneous equations for m and c . This pair of equations is given in the Key Point below.
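As an intermediate step, it may help to see the rearrangement written out. With $R = \sum (m x_n + c - f_n)^2$, the two conditions become sums that can be split term by term, collecting the unknowns $c$ and $m$ on the left:

$$
\begin{aligned}
\sum \left( m x_n + c - f_n \right) = 0
  &\;\Longrightarrow\; c \sum 1 + m \sum x_n = \sum f_n , \\
\sum \left( m x_n + c - f_n \right) x_n = 0
  &\;\Longrightarrow\; c \sum x_n + m \sum x_n^2 = \sum x_n f_n .
\end{aligned}
$$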

Key Point 3

The least squares best fit straight line to the experimental data

$$
(x_1, f_1), (x_2, f_2), (x_3, f_3), \dots, (x_n, f_n)
$$

is

y = m x + c

where m and c are found by solving the pair of equations

$$
\begin{aligned}
c \sum_{1}^{n} 1 + m \sum_{1}^{n} x_n &= \sum_{1}^{n} f_n , \\
c \sum_{1}^{n} x_n + m \sum_{1}^{n} x_n^2 &= \sum_{1}^{n} x_n f_n .
\end{aligned}
$$

(The term $\sum_{1}^{n} 1$ is simply equal to the number of data points, $n$.)
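The pair of equations in the Key Point can be solved in a few lines of code. The sketch below is one possible approach, not the only one: it forms the four sums and solves the 2×2 system by Cramer's rule. The function name `fit_line` and the test data (made up to lie exactly on $y = 3x - 2$) are assumptions for illustration.

```python
def fit_line(xs, fs):
    """Return (m, c) for the least squares line y = m*x + c.

    Solves   c*n  + m*sx  = sf
             c*sx + m*sxx = sxf
    by Cramer's rule (one possible method, chosen for brevity).
    """
    n = len(xs)                                # sum of 1 over the data
    sx = sum(xs)                               # sum of x_n
    sf = sum(fs)                               # sum of f_n
    sxx = sum(x * x for x in xs)               # sum of x_n squared
    sxf = sum(x * f for x, f in zip(xs, fs))   # sum of x_n * f_n
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values coincide, so no unique line exists")
    m = (n * sxf - sx * sf) / det
    c = (sf * sxx - sx * sxf) / det
    return m, c


# Made-up data lying exactly on y = 3x - 2, used only as a sanity check.
m, c = fit_line([0.0, 1.0, 2.0, 3.0], [-2.0, 1.0, 4.0, 7.0])
print(m, c)
```

Because the made-up points lie exactly on a line, the fit should recover that line's slope and intercept.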

Example 7

An experiment is carried out and the following data obtained:

$$
\begin{array}{c|cccc}
x_n & 0.24 & 0.26 & 0.28 & 0.30 \\ \hline
f_n & 1.25 & 0.80 & 0.66 & 0.20
\end{array}
$$

Obtain the least squares best fit straight line, y = m x + c , to these data. Give c and m to 2 decimal places.

Solution

For a hand calculation, tabulating the data makes sense:

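$$
\begin{array}{c|cccc}
 & x_n & f_n & x_n^2 & x_n f_n \\ \hline
 & 0.24 & 1.25 & 0.0576 & 0.3000 \\
 & 0.26 & 0.80 & 0.0676 & 0.2080 \\
 & 0.28 & 0.66 & 0.0784 & 0.1848 \\
 & 0.30 & 0.20 & 0.0900 & 0.0600 \\ \hline
\sum & 1.08 & 2.91 & 0.2936 & 0.7528
\end{array}
$$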

The quantity $\sum 1$ counts the number of data points and in this case is equal to 4.

It follows that the pair of equations for m and c are:

$$
\begin{aligned}
4c + 1.08m &= 2.91 \\
1.08c + 0.2936m &= 0.7528
\end{aligned}
$$

Solving these gives $c = 5.17$ and $m = -16.45$ and we see that the least squares best fit straight line to the given data is

$$
y = 5.17 - 16.45x
$$

Figure 4 shows how well the straight line fits the experimental data.

Figure 4
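The hand calculation in Example 7 can be checked with a short script. This is a sketch that mirrors the tabulation: it forms the column totals, eliminates $c$ to find $m$, and back-substitutes. The variable names are illustrative choices.

```python
xs = [0.24, 0.26, 0.28, 0.30]
fs = [1.25, 0.80, 0.66, 0.20]

# Column totals of the table: sum x, sum f, sum x^2 and sum x*f.
n = len(xs)
sx = sum(xs)
sf = sum(fs)
sxx = sum(x * x for x in xs)
sxf = sum(x * f for x, f in zip(xs, fs))

# Eliminate c between  n*c + sx*m = sf  and  sx*c + sxx*m = sxf.
m = (n * sxf - sx * sf) / (n * sxx - sx * sx)
c = (sf - sx * m) / n  # back-substitute into the first equation

print(f"c = {c:.2f}, m = {m:.2f}")
```

The slope comes out negative, as the falling data values suggest, and both numbers should agree with the hand calculation to 2 decimal places.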


Example 8

Find the best fit straight line to the following experimental data:

$$
\begin{array}{c|ccccc}
x_n & 0.00 & 1.00 & 2.00 & 3.00 & 4.00 \\ \hline
f_n & 1.00 & 3.85 & 6.50 & 9.35 & 12.05
\end{array}
$$

Solution

In order to work out all of the quantities appearing in the pair of equations we tabulate our calculations as follows

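$$
\begin{array}{c|cccc}
 & x_n & f_n & x_n^2 & x_n f_n \\ \hline
 & 0.00 & 1.00 & 0.00 & 0.00 \\
 & 1.00 & 3.85 & 1.00 & 3.85 \\
 & 2.00 & 6.50 & 4.00 & 13.00 \\
 & 3.00 & 9.35 & 9.00 & 28.05 \\
 & 4.00 & 12.05 & 16.00 & 48.20 \\ \hline
\sum & 10.00 & 32.75 & 30.00 & 93.10
\end{array}
$$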

The quantity $\sum 1$ counts the number of data points and is in this case equal to 5.

Hence our pair of equations is

$$
\begin{aligned}
5c + 10m &= 32.75 \\
10c + 30m &= 93.10
\end{aligned}
$$

Solving these equations gives c = 1.03 and m = 2.76 and this means that our best fit straight line to the given data is

y = 1.03 + 2.76 x

Task!

An experiment is carried out and the data obtained are as follows:

$$
\begin{array}{c|cccc}
x_n & 0.2 & 0.3 & 0.5 & 0.9 \\ \hline
f_n & 5.54 & 4.02 & 3.11 & 2.16
\end{array}
$$

Obtain the least squares best fit straight line, y = m x + c , to these data. Give c and m to 2 decimal places.

Tabulating the data gives

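$$
\begin{array}{c|cccc}
 & x_n & f_n & x_n^2 & x_n f_n \\ \hline
 & 0.2 & 5.54 & 0.04 & 1.108 \\
 & 0.3 & 4.02 & 0.09 & 1.206 \\
 & 0.5 & 3.11 & 0.25 & 1.555 \\
 & 0.9 & 2.16 & 0.81 & 1.944 \\ \hline
\sum & 1.9 & 14.83 & 1.19 & 5.813
\end{array}
$$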

The quantity $\sum 1$ counts the number of data points and in this case is equal to 4.

It follows that the pair of equations for m and c are:

$$
\begin{aligned}
4c + 1.9m &= 14.83 \\
1.9c + 1.19m &= 5.813
\end{aligned}
$$

Solving these gives $c = 5.74$ and $m = -4.28$ and we see that the least squares best fit straight line to the given data is

$$
y = 5.74 - 4.28x
$$
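One way to check a fitted line is to confirm that it satisfies the two minimisation conditions: the residuals $m x_n + c - f_n$ should sum to zero, and so should the residuals weighted by $x_n$. The sketch below does this for the data of this Task. The variable names are illustrative, and the fit is recomputed from the sums so that the snippet is self-contained.

```python
xs = [0.2, 0.3, 0.5, 0.9]
fs = [5.54, 4.02, 3.11, 2.16]

# Sums appearing in the pair of equations for m and c.
n = len(xs)
sx, sf = sum(xs), sum(fs)
sxx = sum(x * x for x in xs)
sxf = sum(x * f for x, f in zip(xs, fs))

# Solve the 2x2 system by Cramer's rule.
det = n * sxx - sx * sx
m = (n * sxf - sx * sf) / det
c = (sf * sxx - sx * sxf) / det

# Vertical offsets between the fitted line and each data point.
residuals = [m * x + c - f for x, f in zip(xs, fs)]
sum_res = sum(residuals)                               # half of dR/dc
sum_res_x = sum(r * x for r, x in zip(residuals, xs))  # half of dR/dm

print(f"c = {c:.2f}, m = {m:.2f}")
print(sum_res, sum_res_x)  # both should be zero up to rounding error
```

Both sums vanish only at the least squares values of $m$ and $c$, so a result far from zero would indicate an arithmetic slip.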

Task!

The power output $P$ of a semiconductor laser diode, operating at 35 °C, as a function of the drive current $I$ is measured to be

$$
\begin{array}{c|cccc}
I & 70 & 72 & 74 & 76 \\ \hline
P & 1.33 & 2.08 & 2.88 & 3.31
\end{array}
$$

(Here I and P are measured in mA and mW respectively.)

It is known that, above a certain threshold current, the laser power increases linearly with drive current. Use the least squares approach to fit a straight line, P = m I + c , to these data. Give c and m to 2 decimal places.

Tabulating the data gives

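$$
\begin{array}{c|cccc}
 & I & P & I^2 & I \times P \\ \hline
 & 70 & 1.33 & 4900 & 93.1 \\
 & 72 & 2.08 & 5184 & 149.76 \\
 & 74 & 2.88 & 5476 & 213.12 \\
 & 76 & 3.31 & 5776 & 251.56 \\ \hline
\sum & 292 & 9.6 & 21336 & 707.54
\end{array}
$$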

The quantity $\sum 1$ counts the number of data points and in this case is equal to 4.

It follows that the pair of equations for m and c are:

$$
\begin{aligned}
4c + 292m &= 9.6 \\
292c + 21336m &= 707.54
\end{aligned}
$$

Solving these gives $c = -22.20$ and $m = 0.34$ and we see that the least squares best fit straight line to the given data is

$$
P = -22.20 + 0.34 I .
$$
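In this Task the determinant of the pair of equations is $4 \times 21336 - 292^2 = 85344 - 85264 = 80$, a small difference between two large numbers, so rounding an intermediate sum can noticeably change $c$. The sketch below sidesteps this by using exact rational arithmetic from Python's standard `fractions` module. This is an illustrative choice rather than a method from the text, and the variable names are assumptions.

```python
from fractions import Fraction

# Enter the data as decimal strings so that each value is held exactly.
currents = [Fraction(s) for s in ("70", "72", "74", "76")]        # I in mA
powers = [Fraction(s) for s in ("1.33", "2.08", "2.88", "3.31")]  # P in mW

n = len(currents)
s_i = sum(currents)
s_p = sum(powers)
s_ii = sum(i * i for i in currents)
s_ip = sum(i * p for i, p in zip(currents, powers))

# Solve  n*c + s_i*m = s_p  and  s_i*c + s_ii*m = s_ip  by Cramer's rule.
det = n * s_ii - s_i * s_i
m = (n * s_ip - s_i * s_p) / det
c = (s_p * s_ii - s_i * s_ip) / det

print(det, float(m), float(c))
```

Rounding the exact results to 2 decimal places at the very end gives the values of $m$ and $c$ asked for in the Task.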