2 Two-way ANOVA with interaction

The previous subsection looked at two-way ANOVA under the assumption that there was no interaction between the factors $A$ and $B$. We now extend two-way ANOVA to allow for possible interaction between the factors under consideration. The following analysis allows us to test whether we have sufficient evidence to reject the null hypothesis that the amount of interaction is effectively zero.

To see how we might consider interaction between factors A and B taking place, look at the following table which represents observations involving a two-factor experiment.

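$$
\begin{array}{c|ccccc}
\text{Factor } A \,\backslash\, \text{Factor } B & 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 3 & 5 & 1 & 9 & 12 \\
2 & 4 & 6 & 2 & 10 & 13 \\
3 & 6 & 8 & 4 & 12 & 15
\end{array}
$$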

A brief inspection of the numbers in the five columns reveals that there is a constant difference between any two rows as we move from column to column. Similarly, there is a constant difference between any two columns as we move from row to row. While the data are clearly contrived, they illustrate a case in which no interaction arises: the differences between rows do not vary from column to column, and the differences between columns do not vary from row to row. Real data do not, of course, exhibit such behaviour in general. We expect the differences to vary, and so we must check whether the variation is large enough to provide sufficient evidence to reject the null hypothesis that the amount of interaction is effectively zero.
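The inspection described above can be sketched in a few lines of Python. This is only an illustrative check on the contrived table, and the function names `row_differences`, `column_differences` and `is_additive` are our own, not part of any library.

```python
# Introductory table: rows are levels of factor A, columns are levels of factor B.
table = [
    [3, 5, 1, 9, 12],
    [4, 6, 2, 10, 13],
    [6, 8, 4, 12, 15],
]


def row_differences(rows, r, s):
    """Differences between rows r and s (0-based), column by column."""
    return [x - y for x, y in zip(rows[r], rows[s])]


def column_differences(rows, c, d):
    """Differences between columns c and d (0-based), row by row."""
    return [row[c] - row[d] for row in rows]


def is_additive(rows):
    """True if every pair of rows differs by a constant across all columns."""
    n_rows = len(rows)
    return all(
        len(set(row_differences(rows, r, s))) == 1
        for r in range(n_rows)
        for s in range(r + 1, n_rows)
    )


print(row_differences(table, 1, 0))     # row 2 minus row 1
print(column_differences(table, 1, 0))  # column 2 minus column 1
print(is_additive(table))
```

Checking the rows is enough: if every pair of rows differs by a constant, every pair of columns does too.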

Notation

Let $a$ represent the number of ‘levels’ present for factor $A$, denoted $i = 1, \ldots, a$.

Let $b$ represent the number of ‘levels’ present for factor $B$, denoted $j = 1, \ldots, b$.

Let $n$ represent the number of observations per cell. We assume that it is the same for each cell.

In the table above, $a = 3$, $b = 5$, $n = 1$. In the examples we shall consider, $n$ will be greater than 1 and we will be able to check for interaction between the factors.

We suppose that the observations at level $i$ of factor $A$ and level $j$ of factor $B$ are taken from a normal distribution with mean $\mu_{ij}$. When we assumed that there was no interaction, we used the additive model

$$\mu_{ij} = \mu + \alpha_i + \beta_j$$

So, for example, the difference $\mu_{i1} - \mu_{i2}$ between the means at levels 1 and 2 of factor $B$ is equal to $\beta_1 - \beta_2$ and does not depend upon the level of factor $A$. When we allow interaction, this is not necessarily true and we write

$$\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$$

Here $\gamma_{ij}$ is an interaction effect. Now $\mu_{i1} - \mu_{i2} = \beta_1 - \beta_2 + \gamma_{i1} - \gamma_{i2}$, so the difference between two levels of factor $B$ depends on the level of factor $A$.

2.1 Fixed and random effects

Often the levels assigned to a factor will be chosen deliberately. In this case the factors are said to be fixed and we have a fixed-effects model. If the levels are chosen at random from a population of all possible levels, the factors are said to be random and we have a random-effects model. Sometimes one factor may be fixed while the other is random. In this case we have a mixed-effects model. In effect, we are asking whether we are interested in certain particular levels of a factor (fixed effects) or whether we just regard the levels as a sample and are interested in the population in general (random effects).

Calculation method

The data you will be working with will be set out in a manner similar to that shown below.

The table assumes n observations per cell and is shown along with a variety of totals and means which will be used in the calculations of the various test statistics to follow.

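$$
\begin{array}{l|cccccc|c}
\text{Factor } A \,\backslash\, \text{Factor } B & \text{Level } 1 & \text{Level } 2 & \cdots & \text{Level } j & \cdots & \text{Level } b & \text{Totals} \\
\hline
\text{Level } 1 & x_{111}, \ldots, x_{11n} & x_{121}, \ldots, x_{12n} & \cdots & x_{1j1}, \ldots, x_{1jn} & \cdots & x_{1b1}, \ldots, x_{1bn} & T_{1..} \\
\text{Level } 2 & x_{211}, \ldots, x_{21n} & x_{221}, \ldots, x_{22n} & \cdots & x_{2j1}, \ldots, x_{2jn} & \cdots & x_{2b1}, \ldots, x_{2bn} & T_{2..} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
\text{Level } i & x_{i11}, \ldots, x_{i1n} & x_{i21}, \ldots, x_{i2n} & \cdots & x_{ij1}, \ldots, x_{ijn} & \cdots & x_{ib1}, \ldots, x_{ibn} & T_{i..} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
\text{Level } a & x_{a11}, \ldots, x_{a1n} & x_{a21}, \ldots, x_{a2n} & \cdots & x_{aj1}, \ldots, x_{ajn} & \cdots & x_{ab1}, \ldots, x_{abn} & T_{a..} \\
\hline
\text{Totals} & T_{.1.} & T_{.2.} & \cdots & T_{.j.} & \cdots & T_{.b.} & T_{...}
\end{array}
$$

The sum of the data in cell $(i, j)$ is $T_{ij.} = \sum_{k=1}^{n} x_{ijk}$.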

Notes

  1. $T_{...}$ represents the grand total of the data values so that

    $$T_{...} = \sum_{j=1}^{b} T_{.j.} = \sum_{i=1}^{a} T_{i..} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} x_{ijk}$$

  2. $T_{i..}$ represents the total of the data in the $i$th row.
  3. $T_{.j.}$ represents the total of the data in the $j$th column.
  4. The total number of data entries is given by $N = nab$.

Partitioning the variation

We are now in a position to consider the partition of the total sum of the squared deviations from the overall mean, which we estimate as

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} x_{ijk} = \frac{T_{...}}{N}$$

The total sum of the squared deviations is

$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} \left( x_{ijk} - \bar{x} \right)^2$$

and it can be shown that this quantity can be written as

$$SS_T = SS_A + SS_B + SS_{AB} + SS_E$$

where $SS_T$ is the total sum of squares given by

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} x_{ijk}^2 - \frac{T_{...}^2}{N};$$

$SS_A$ is the sum of squares due to variations caused by factor $A$ given by

$$SS_A = \sum_{i=1}^{a} \frac{T_{i..}^2}{bn} - \frac{T_{...}^2}{N}$$

$SS_B$ is the sum of squares due to variations caused by factor $B$ given by

$$SS_B = \sum_{j=1}^{b} \frac{T_{.j.}^2}{an} - \frac{T_{...}^2}{N}$$

Note that $bn$ means $b \times n$, which is the number of observations at each level of $A$, and $an$ means $a \times n$, which is the number of observations at each level of $B$.

$SS_{AB}$ is the sum of the squares due to variations caused by the interaction of factors $A$ and $B$ and is given by

$$SS_{AB} = \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{T_{ij.}^2}{n} - \frac{T_{...}^2}{N} - SS_A - SS_B.$$

Note that the quantity $T_{ij.} = \sum_{k=1}^{n} x_{ijk}$ is the sum of the data in the $(i, j)$th cell and that the quantity $\sum_{i=1}^{a}\sum_{j=1}^{b} \frac{T_{ij.}^2}{n} - \frac{T_{...}^2}{N}$ is the sum of the squares between cells.

$SS_E$ is the sum of the squares due to chance or experimental error and is given by

$$SS_E = SS_T - SS_A - SS_B - SS_{AB}$$

The number of degrees of freedom $(N - 1)$ is partitioned as follows:

$$
\begin{array}{ccccc}
SS_T & SS_A & SS_B & SS_{AB} & SS_E \\
\hline
N - 1 & a - 1 & b - 1 & (a - 1)(b - 1) & N - ab
\end{array}
$$

Note that there are $ab - 1$ degrees of freedom between cells and that the number of degrees of freedom for $SS_{AB}$ is given by

$$ab - 1 - (a - 1) - (b - 1) = (a - 1)(b - 1)$$
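The partition of the sums of squares and of the degrees of freedom can be sketched in plain Python as follows. This is a minimal illustration that assumes a balanced layout (the same $n$ in every cell). The function name `two_way_sums_of_squares` and the nested-list data layout are our own choices, not taken from any library.

```python
def two_way_sums_of_squares(data):
    """Partition the total sum of squares for a balanced two-factor layout.

    data[i][j] is the list of the n observations in cell (i, j).
    Returns two dictionaries: sums of squares and degrees of freedom.
    """
    a = len(data)          # number of levels of factor A
    b = len(data[0])       # number of levels of factor B
    n = len(data[0][0])    # observations per cell (assumed equal in every cell)
    N = a * b * n

    cell_totals = [[sum(cell) for cell in row] for row in data]                # T_ij.
    row_totals = [sum(row) for row in cell_totals]                             # T_i..
    col_totals = [sum(cell_totals[i][j] for i in range(a)) for j in range(b)]  # T_.j.
    grand_total = sum(row_totals)                                              # T_...
    correction = grand_total ** 2 / N                                          # T_...^2 / N

    ss_t = sum(x * x for row in data for cell in row for x in cell) - correction
    ss_a = sum(t * t for t in row_totals) / (b * n) - correction
    ss_b = sum(t * t for t in col_totals) / (a * n) - correction
    ss_cells = sum(t * t for row in cell_totals for t in row) / n - correction
    ss_ab = ss_cells - ss_a - ss_b
    ss_e = ss_t - ss_a - ss_b - ss_ab

    sums_of_squares = {"A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e, "T": ss_t}
    degrees_of_freedom = {
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "E": N - a * b,
        "T": N - 1,
    }
    return sums_of_squares, degrees_of_freedom


# The introductory additive table, with n = 1 observation per cell.
intro = [
    [[3], [5], [1], [9], [12]],
    [[4], [6], [2], [10], [13]],
    [[6], [8], [4], [12], [15]],
]
ss, df = two_way_sums_of_squares(intro)
for source in ("A", "B", "AB", "E", "T"):
    print(source, round(ss[source], 3), df[source])
```

For the contrived introductory table the interaction sum of squares comes out as zero (up to rounding), in line with the earlier observation that the rows differ by constants. With $n = 1$ there are $N - ab = 0$ degrees of freedom left for error, which is why replication within cells is needed before interaction can be tested.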

This gives rise to the following two-way ANOVA tables.

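Two-Way ANOVA Table - Fixed-Effects Model

$$
\begin{array}{lllll}
\hline
\text{Source of Variation} & \text{Sum of Squares } SS & \text{Degrees of Freedom} & \text{Mean Square } MS & \text{Value of } F \text{ Ratio} \\
\hline
\text{Factor } A & SS_A & a - 1 & MS_A = \dfrac{SS_A}{a - 1} & F = \dfrac{MS_A}{MS_E} \\
\text{Factor } B & SS_B & b - 1 & MS_B = \dfrac{SS_B}{b - 1} & F = \dfrac{MS_B}{MS_E} \\
\text{Interaction} & SS_{AB} & (a - 1)(b - 1) & MS_{AB} = \dfrac{SS_{AB}}{(a - 1)(b - 1)} & F = \dfrac{MS_{AB}}{MS_E} \\
\text{Residual Error} & SS_E & N - ab & MS_E = \dfrac{SS_E}{N - ab} & \\
\hline
\text{Totals} & SS_T & N - 1 & & \\
\hline
\end{array}
$$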

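Two-Way ANOVA Table - Random-Effects Model

$$
\begin{array}{lllll}
\hline
\text{Source of Variation} & \text{Sum of Squares } SS & \text{Degrees of Freedom} & \text{Mean Square } MS & \text{Value of } F \text{ Ratio} \\
\hline
\text{Factor } A & SS_A & a - 1 & MS_A = \dfrac{SS_A}{a - 1} & F = \dfrac{MS_A}{MS_{AB}} \\
\text{Factor } B & SS_B & b - 1 & MS_B = \dfrac{SS_B}{b - 1} & F = \dfrac{MS_B}{MS_{AB}} \\
\text{Interaction} & SS_{AB} & (a - 1)(b - 1) & MS_{AB} = \dfrac{SS_{AB}}{(a - 1)(b - 1)} & F = \dfrac{MS_{AB}}{MS_E} \\
\text{Residual Error} & SS_E & N - ab & MS_E = \dfrac{SS_E}{N - ab} & \\
\hline
\text{Totals} & SS_T & N - 1 & & \\
\hline
\end{array}
$$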

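Two-Way ANOVA Table - Mixed-Effects Model

Case (i) $A$ fixed and $B$ random.

$$
\begin{array}{lllll}
\hline
\text{Source of Variation} & \text{Sum of Squares } SS & \text{Degrees of Freedom} & \text{Mean Square } MS & \text{Value of } F \text{ Ratio} \\
\hline
\text{Factor } A & SS_A & a - 1 & MS_A = \dfrac{SS_A}{a - 1} & F = \dfrac{MS_A}{MS_{AB}} \\
\text{Factor } B & SS_B & b - 1 & MS_B = \dfrac{SS_B}{b - 1} & F = \dfrac{MS_B}{MS_E} \\
\text{Interaction} & SS_{AB} & (a - 1)(b - 1) & MS_{AB} = \dfrac{SS_{AB}}{(a - 1)(b - 1)} & F = \dfrac{MS_{AB}}{MS_E} \\
\text{Residual Error} & SS_E & N - ab & MS_E = \dfrac{SS_E}{N - ab} & \\
\hline
\text{Totals} & SS_T & N - 1 & & \\
\hline
\end{array}
$$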

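Case (ii) $A$ random and $B$ fixed.

$$
\begin{array}{lllll}
\hline
\text{Source of Variation} & \text{Sum of Squares } SS & \text{Degrees of Freedom} & \text{Mean Square } MS & \text{Value of } F \text{ Ratio} \\
\hline
\text{Factor } A & SS_A & a - 1 & MS_A = \dfrac{SS_A}{a - 1} & F = \dfrac{MS_A}{MS_E} \\
\text{Factor } B & SS_B & b - 1 & MS_B = \dfrac{SS_B}{b - 1} & F = \dfrac{MS_B}{MS_{AB}} \\
\text{Interaction} & SS_{AB} & (a - 1)(b - 1) & MS_{AB} = \dfrac{SS_{AB}}{(a - 1)(b - 1)} & F = \dfrac{MS_{AB}}{MS_E} \\
\text{Residual Error} & SS_E & N - ab & MS_E = \dfrac{SS_E}{N - ab} & \\
\hline
\text{Totals} & SS_T & N - 1 & & \\
\hline
\end{array}
$$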
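The only difference between the four tables is the denominator of the $F$ ratio for each main effect. The small Python sketch below summarises the pattern. The function name `f_ratio_denominators` and the string labels are purely illustrative.

```python
def f_ratio_denominators(a_fixed, b_fixed):
    """Return the denominator mean square used in each F ratio.

    a_fixed / b_fixed: True if that factor's levels were chosen deliberately
    (fixed), False if they are regarded as a random sample of possible levels.
    """
    return {
        # A main effect is tested against MS_AB when the OTHER factor is random.
        "A": "MS_E" if b_fixed else "MS_AB",
        "B": "MS_E" if a_fixed else "MS_AB",
        # The interaction is tested against MS_E in every one of the tables.
        "AB": "MS_E",
    }


models = [
    ("fixed effects", True, True),
    ("random effects", False, False),
    ("mixed, A fixed and B random", True, False),
    ("mixed, A random and B fixed", False, True),
]
for label, a_fixed, b_fixed in models:
    print(label, f_ratio_denominators(a_fixed, b_fixed))
```

Encoding the rule through the status of the other factor reproduces all four tables from a single expression per main effect.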

Example 1

In an experiment to compare the effects of weathering on paint of three different types, two identical surfaces coated with each type of paint were exposed in each of four environments. Measurements of the degree of deterioration were made as follows.

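$$
\begin{array}{l|cccc}
 & \text{Environment 1} & \text{Environment 2} & \text{Environment 3} & \text{Environment 4} \\
\hline
\text{Paint A} & 10.89,\ 10.74 & 9.94,\ 11.25 & 9.88,\ 10.13 & 14.11,\ 12.84 \\
\text{Paint B} & 12.28,\ 13.11 & 14.45,\ 11.17 & 11.29,\ 11.10 & 13.44,\ 11.37 \\
\text{Paint C} & 10.68,\ 10.30 & 10.89,\ 10.97 & 10.61,\ 11.00 & 12.22,\ 11.32
\end{array}
$$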

Making the assumptions of normality, independence and equal variance, derive the appropriate ANOVA tables and state the conclusions which may be drawn at the 5% level of significance in the following cases.

  1. The types of paint and the environments are chosen deliberately because the interest is in these paints and these environments.
  2. The types of paint are chosen deliberately because the interest is in these paints but the environments are regarded as a sample of possible environments.
  3. The types of paint are regarded as a random sample of possible paints and the environments are regarded as a sample of possible environments.
Solution

We know that Case 1 is described by a fixed-effects model, Case 2 by a mixed-effects model (paint type fixed, environment random) and Case 3 by a random-effects model.

In all three cases the calculations necessary to find $MS_P$ (paints), $MS_N$ (environments), $MS_{PN}$ (interaction) and $MS_E$ (residual error) are identical. Only the calculation and interpretation of the test statistics will be different. The calculations are shown below.

Subtracting 10 from each observation, the data become:

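$$
\begin{array}{l|cccc|c}
 & \text{Environment 1} & \text{Environment 2} & \text{Environment 3} & \text{Environment 4} & \text{Total} \\
\hline
\text{Paint A} & 0.89,\ 0.74 & -0.06,\ 1.25 & -0.12,\ 0.13 & 4.11,\ 2.84 & \\
\text{Cell total} & 1.63 & 1.19 & 0.01 & 6.95 & 9.78 \\
\hline
\text{Paint B} & 2.28,\ 3.11 & 4.45,\ 1.17 & 1.29,\ 1.10 & 3.44,\ 1.37 & \\
\text{Cell total} & 5.39 & 5.62 & 2.39 & 4.81 & 18.21 \\
\hline
\text{Paint C} & 0.68,\ 0.30 & 0.89,\ 0.97 & 0.61,\ 1.00 & 2.22,\ 1.32 & \\
\text{Cell total} & 0.98 & 1.86 & 1.61 & 3.54 & 7.99 \\
\hline
\text{Total} & 8.00 & 8.67 & 4.01 & 15.30 & 35.98
\end{array}
$$

The total sum of squares is

$$SS_T = 0.89^2 + 0.74^2 + \cdots + 1.32^2 - \frac{35.98^2}{24} = 36.910$$

We can simplify the calculation by finding the between samples sum of squares

$$SS_S = \frac{1}{2}\left( 1.63^2 + 5.39^2 + \cdots + 3.54^2 \right) - \frac{35.98^2}{24} = 26.762$$

Sum of squares for paints is

$$SS_P = \frac{1}{8}\left( 9.78^2 + 18.21^2 + 7.99^2 \right) - \frac{35.98^2}{24} = 7.447$$

Sum of squares for environments is

$$SS_N = \frac{1}{6}\left( 8.00^2 + 8.67^2 + 4.01^2 + 15.30^2 \right) - \frac{35.98^2}{24} = 10.950$$

So the interaction sum of squares is $SS_{PN} = SS_S - SS_P - SS_N = 8.365$ and the residual sum of squares is $SS_E = SS_T - SS_S = 10.148$. The results are combined in the following ANOVA table.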

$$
\begin{array}{lrrrrrr}
\hline
\text{Source} & SS & \text{DoF} & MS & F \text{ (Case 1, fixed)} & F \text{ (Case 2, mixed)} & F \text{ (Case 3, random)} \\
\hline
\text{Paints} & 7.447 & 2 & 3.723 & 4.40 & 2.67 & 2.67 \\
\text{Environments} & 10.950 & 3 & 3.650 & 4.32 & 4.32 & 2.62 \\
\text{Interaction} & 8.365 & 6 & 1.394 & 1.65 & 1.65 & 1.65 \\
\text{Residual} & 10.148 & 12 & 0.846 & & & \\
\hline
\text{Totals} & 36.910 & 23 & & & & \\
\hline
\end{array}
$$

The following conclusions may be drawn. The interaction $F$ ratio is $MS_{PN}/MS_E = 1.65$ in all three cases, which is less than the 5% critical value $F_{6,12} = 3.00$, so there is insufficient evidence to support the interaction hypothesis in any case. Therefore we can look at the tests for the main effects.

Case 1.
Since $4.40 > 3.89$ we have sufficient evidence to conclude that paint type affects the degree of deterioration. Since $4.32 > 3.49$ we have sufficient evidence to conclude that environment affects the degree of deterioration.
Case 2.
Since $2.67 < 5.14$ we do not have sufficient evidence to reject the hypothesis that paint type has no effect on the degree of deterioration. Since $4.32 > 3.49$ we have sufficient evidence to conclude that environment affects the degree of deterioration.
Case 3.
Since $2.67 < 5.14$ we do not have sufficient evidence to reject the hypothesis that paint type has no effect on the degree of deterioration. Since $2.62 < 4.76$ we do not have sufficient evidence to reject the hypothesis that environment has no effect on the degree of deterioration.

If the test for interaction had given a significant result then we would have concluded that there was an interaction effect. Therefore the differences between the average degree of deterioration for different paint types would have depended on the environment and there might have been no overall ‘best paint type’. We would have needed to compare combinations of paint types and environments. However the relative sizes of the mean squares would have helped to indicate which effects were most important.
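As a check on the arithmetic of Example 1, the mean squares and $F$ ratios for the three cases can be recomputed from the sums of squares quoted above. The sketch below is illustrative only: the dictionary names are our own, and the 5% critical values are typed in from standard $F$ tables rather than computed.

```python
# Sums of squares and degrees of freedom as quoted in Example 1
# (P = paints, N = environments, PN = interaction, E = residual).
ss = {"P": 7.447, "N": 10.950, "PN": 8.365, "E": 10.148}
df = {"P": 2, "N": 3, "PN": 6, "E": 12}
ms = {source: ss[source] / df[source] for source in ss}

# Denominator mean square of each F ratio in the three cases of the example.
denominators = {
    "Case 1 (fixed)": {"P": "E", "N": "E", "PN": "E"},
    "Case 2 (mixed)": {"P": "PN", "N": "E", "PN": "E"},   # paints fixed, environments random
    "Case 3 (random)": {"P": "PN", "N": "PN", "PN": "E"},
}

# 5% critical values, keyed by (numerator df, denominator df), from standard F tables.
critical = {(2, 12): 3.89, (3, 12): 3.49, (6, 12): 3.00, (2, 6): 5.14, (3, 6): 4.76}

results = {}
for case, denoms in denominators.items():
    for source, denom in denoms.items():
        f_ratio = ms[source] / ms[denom]
        f_crit = critical[(df[source], df[denom])]
        results[(case, source)] = (f_ratio, f_crit, f_ratio > f_crit)
        print(f"{case:16s} {source:2s}  F = {f_ratio:5.2f}  "
              f"critical = {f_crit:4.2f}  significant: {f_ratio > f_crit}")
```

Only the denominators change from case to case, so the same four mean squares serve all three analyses.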

Task!

A motor company wishes to check the influences of tyre type and shock absorber settings on the roadholding of one of its cars. Two types of tyre are selected from the tyre manufacturer who normally provides tyres for the company’s new vehicles. A shock absorber with three possible settings is chosen from a range of shock absorbers deemed to be suitable for the car. An experiment is carried out by conducting roadholding tests using each combination of tyre type and shock absorber setting. The (coded) data resulting from the experiment are given below.

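$$
\begin{array}{c|ccc}
\text{Tyre} \,\backslash\, \text{Shock Absorber Setting} & \text{B1 = Comfort} & \text{B2 = Normal} & \text{B3 = Sport} \\
\hline
 & 5 & 8 & 6 \\
\text{Type A1} & 6 & 5 & 9 \\
 & 8 & 3 & 12 \\
\hline
 & 9 & 10 & 12 \\
\text{Type A2} & 7 & 9 & 10 \\
 & 7 & 8 & 9
\end{array}
$$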

Decide whether an appropriate model has random-effects, mixed-effects or fixed-effects and derive the appropriate ANOVA table. State clearly any conclusions that may be drawn at the 5% level of significance.

Neither the tyres nor the shock absorber settings are chosen at random from populations consisting of all possible tyre types and shock absorber types, so their influence is described by a fixed-effects model. The calculations necessary to find $MS_A$, $MS_B$, $MS_{AB}$ and $MS_E$ are shown below.

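$$
\begin{array}{c|ccc|c}
 & \text{B1} & \text{B2} & \text{B3} & \text{Totals} \\
\hline
 & 5 & 8 & 6 & \\
\text{A1} & 6 & 5 & 9 & \\
 & 8 & 3 & 12 & \\
 & T_{11.} = 19 & T_{12.} = 16 & T_{13.} = 27 & T_{1..} = 62 \\
\hline
 & 9 & 10 & 12 & \\
\text{A2} & 7 & 9 & 10 & \\
 & 7 & 8 & 9 & \\
 & T_{21.} = 23 & T_{22.} = 27 & T_{23.} = 31 & T_{2..} = 81 \\
\hline
\text{Totals} & T_{.1.} = 42 & T_{.2.} = 43 & T_{.3.} = 58 & T_{...} = 143
\end{array}
$$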

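With $a = 2$, $b = 3$, $n = 3$ and $N = 18$, the sums of squares calculations are:

$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{3}\sum_{k=1}^{3} x_{ijk}^2 - \frac{T_{...}^2}{N} = 5^2 + 6^2 + \cdots + 10^2 + 9^2 - \frac{143^2}{18} = 1233 - \frac{143^2}{18} = 96.944$$

$$SS_A = \sum_{i=1}^{2} \frac{T_{i..}^2}{bn} - \frac{T_{...}^2}{N} = \frac{62^2 + 81^2}{3 \times 3} - \frac{143^2}{18} = \frac{10405}{9} - \frac{143^2}{18} = 20.056$$

$$SS_B = \sum_{j=1}^{3} \frac{T_{.j.}^2}{an} - \frac{T_{...}^2}{N} = \frac{42^2 + 43^2 + 58^2}{2 \times 3} - \frac{143^2}{18} = \frac{6977}{6} - \frac{143^2}{18} = 26.778$$

$$SS_{AB} = \sum_{i=1}^{2}\sum_{j=1}^{3} \frac{T_{ij.}^2}{n} - \frac{T_{...}^2}{N} - SS_A - SS_B = \frac{19^2 + \cdots + 31^2}{3} - \frac{143^2}{18} - 20.056 - 26.778$$

$$= \frac{3565}{3} - \frac{143^2}{18} - 20.056 - 26.778 = 5.444$$

$$SS_E = SS_T - SS_A - SS_B - SS_{AB} = 96.944 - 20.056 - 26.778 - 5.444 = 44.666$$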

The results are combined in the following ANOVA table.

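$$
\begin{array}{lrrrlrl}
\hline
\text{Source} & SS & \text{DoF} & MS & F \text{ ratio (fixed)} & \text{Value of } F & \text{5\% critical value} \\
\hline
\text{Factor } A & 20.056 & 1 & 20.056 & MS_A / MS_E & 5.39 & F_{1,12} = 4.75 \\
\text{Factor } B & 26.778 & 2 & 13.389 & MS_B / MS_E & 3.60 & F_{2,12} = 3.89 \\
\text{Interaction } AB & 5.444 & 2 & 2.722 & MS_{AB} / MS_E & 0.731 & F_{2,12} = 3.89 \\
\text{Residual } E & 44.666 & 12 & 3.722 & & & \\
\hline
\text{Totals} & 96.944 & 17 & & & & \\
\hline
\end{array}
$$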

The following conclusions may be drawn:

Interaction: Since $0.731 < 3.89$ there is insufficient evidence to support the hypothesis that interaction takes place between the factors.

Factor $A$: Since $5.39 > 4.75$ we have sufficient evidence to reject the hypothesis that tyre type does not affect the roadholding of the car.

Factor $B$: Since $3.60 < 3.89$ we do not have sufficient evidence to reject the hypothesis that shock absorber settings do not affect the roadholding of the car.
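To connect the interaction conclusion back to the idea of ‘constant differences’ introduced at the start of this section, the sketch below computes the cell means for the tyre data and the difference between the two tyre types at each shock absorber setting. It is only an illustration, and the variable names are our own.

```python
# Coded roadholding data: data[tyre][setting] is the list of the 3 observations.
data = {
    "A1": {"B1": [5, 6, 8], "B2": [8, 5, 3], "B3": [6, 9, 12]},
    "A2": {"B1": [9, 7, 7], "B2": [10, 9, 8], "B3": [12, 10, 9]},
}


def cell_mean(values):
    """Arithmetic mean of the observations in one cell."""
    return sum(values) / len(values)


means = {
    tyre: {setting: cell_mean(obs) for setting, obs in cells.items()}
    for tyre, cells in data.items()
}

# Difference between tyre types (A2 minus A1) at each shock absorber setting.
differences = {
    setting: means["A2"][setting] - means["A1"][setting]
    for setting in ("B1", "B2", "B3")
}

for setting, diff in differences.items():
    print(setting, round(diff, 2))
```

With no interaction these three differences would be equal apart from chance variation. Here they are not identical, but the interaction $F$ ratio of 0.731 indicates that the variation among them is small relative to the residual mean square.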

Task!

The variability of a measured characteristic of an electronic assembly is a source of trouble for a manufacturer with global manufacturing and sales facilities. To investigate the possible influences of assembly machines and testing stations on the characteristic, an engineer chooses three testing stations and three assembly machines from the large number of stations and machines in the possession of the company. For each testing station - assembly machine combination, three observations of the characteristic are made.

The (coded) data resulting from the experiment are given below.

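$$
\begin{array}{c|ccc}
\text{Assembly Machine} \,\backslash\, \text{Testing Station} & \text{B1} & \text{B2} & \text{B3} \\
\hline
 & 2.3 & 3.7 & 3.1 \\
\text{A1} & 3.4 & 2.8 & 3.2 \\
 & 3.5 & 3.7 & 3.5 \\
\hline
 & 3.5 & 3.9 & 3.3 \\
\text{A2} & 2.6 & 3.9 & 3.4 \\
 & 3.6 & 3.4 & 3.5 \\
\hline
 & 2.4 & 3.5 & 2.6 \\
\text{A3} & 2.7 & 3.2 & 2.6 \\
 & 2.8 & 3.5 & 2.5
\end{array}
$$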

Decide whether an appropriate model has random-effects, mixed-effects or fixed-effects and derive the appropriate ANOVA table.

State clearly any conclusions that may be drawn at the 5% level of significance.

Both the machines and the testing stations are effectively chosen at random from populations consisting of all possible types, so their influence is described by a random-effects model. The calculations necessary to find $MS_A$, $MS_B$, $MS_{AB}$ and $MS_E$ are shown below.

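$$
\begin{array}{c|ccc|c}
 & \text{B1} & \text{B2} & \text{B3} & \text{Totals} \\
\hline
 & 2.3 & 3.7 & 3.1 & \\
\text{A1} & 3.4 & 2.8 & 3.2 & \\
 & 3.5 & 3.7 & 3.5 & \\
 & T_{11.} = 9.2 & T_{12.} = 10.2 & T_{13.} = 9.8 & T_{1..} = 29.2 \\
\hline
 & 3.5 & 3.9 & 3.3 & \\
\text{A2} & 2.6 & 3.9 & 3.4 & \\
 & 3.6 & 3.4 & 3.5 & \\
 & T_{21.} = 9.7 & T_{22.} = 11.2 & T_{23.} = 10.2 & T_{2..} = 31.1 \\
\hline
 & 2.4 & 3.5 & 2.6 & \\
\text{A3} & 2.7 & 3.2 & 2.6 & \\
 & 2.8 & 3.5 & 2.5 & \\
 & T_{31.} = 7.9 & T_{32.} = 10.2 & T_{33.} = 7.7 & T_{3..} = 25.8 \\
\hline
\text{Totals} & T_{.1.} = 26.8 & T_{.2.} = 31.6 & T_{.3.} = 27.7 & T_{...} = 86.1
\end{array}
$$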

$a = 3$, $b = 3$, $n = 3$, $N = 27$ and the sums of squares calculations are:

$$SS_T = \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3} x_{ijk}^2 - \frac{T_{...}^2}{N} = 2.3^2 + 3.4^2 + \cdots + 2.6^2 + 2.5^2 - \frac{86.1^2}{27} = 5.907$$

$$SS_A = \sum_{i=1}^{3} \frac{T_{i..}^2}{bn} - \frac{T_{...}^2}{N} = \frac{29.2^2 + 31.1^2 + 25.8^2}{3 \times 3} - \frac{86.1^2}{27} = 1.602$$

$$SS_B = \sum_{j=1}^{3} \frac{T_{.j.}^2}{an} - \frac{T_{...}^2}{N} = \frac{26.8^2 + 31.6^2 + 27.7^2}{3 \times 3} - \frac{86.1^2}{27} = 1.447$$

$$SS_{AB} = \sum_{i=1}^{3}\sum_{j=1}^{3} \frac{T_{ij.}^2}{n} - \frac{T_{...}^2}{N} - SS_A - SS_B$$

$$= \frac{9.2^2 + 10.2^2 + \cdots + 10.2^2 + 7.7^2}{3} - \frac{86.1^2}{27} - 1.602 - 1.447 = 0.398$$

$$SS_E = SS_T - SS_A - SS_B - SS_{AB} = 5.907 - 1.602 - 1.447 - 0.398 = 2.460$$

The results are combined in the following ANOVA table

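$$
\begin{array}{lrrrlrl}
\hline
\text{Source} & SS & \text{DoF} & MS & F \text{ ratio (random)} & \text{Value of } F & \text{5\% critical value} \\
\hline
\text{Factor } A \text{ (Machines)} & 1.602 & 2 & 0.801 & MS_A / MS_{AB} & 8.05 & F_{2,4} = 6.94 \\
\text{Factor } B \text{ (Stations)} & 1.447 & 2 & 0.724 & MS_B / MS_{AB} & 7.28 & F_{2,4} = 6.94 \\
\text{Interaction } AB & 0.398 & 4 & 0.0995 & MS_{AB} / MS_E & 0.728 & F_{4,18} = 2.93 \\
\text{Residual } E & 2.460 & 18 & 0.137 & & & \\
\hline
\text{Totals} & 5.907 & 26 & & & & \\
\hline
\end{array}
$$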

The following conclusions may be drawn.

Interaction: Since $0.728 < 2.93$ there is insufficient evidence to support the hypothesis that interaction takes place between the factors.

Factor $A$: Since $8.05 > 6.94$ we have sufficient evidence to reject the hypothesis that the assembly machines do not affect the assembly characteristic.

Factor $B$: Since $7.28 > 6.94$ we have sufficient evidence to reject the hypothesis that the choice of testing station does not affect the assembly characteristic.
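The residual sum of squares in this Task was obtained by subtraction. As an independent check, it can also be computed directly as the pooled sum of squared deviations of the observations from their own cell means. The Python sketch below does this for the assembly data. The helper name `within_cell_ss` is our own.

```python
# Coded data: data[i][j] holds the 3 observations for machine A(i+1) at station B(j+1).
data = [
    [[2.3, 3.4, 3.5], [3.7, 2.8, 3.7], [3.1, 3.2, 3.5]],  # A1
    [[3.5, 2.6, 3.6], [3.9, 3.9, 3.4], [3.3, 3.4, 3.5]],  # A2
    [[2.4, 2.7, 2.8], [3.5, 3.2, 3.5], [2.6, 2.6, 2.5]],  # A3
]


def within_cell_ss(cell):
    """Sum of squared deviations of the observations from their cell mean."""
    mean = sum(cell) / len(cell)
    return sum((x - mean) ** 2 for x in cell)


ss_e = sum(within_cell_ss(cell) for row in data for cell in row)
n_cells = sum(len(row) for row in data)                    # ab
n_obs = sum(len(cell) for row in data for cell in row)     # N
df_e = n_obs - n_cells                                     # N - ab
ms_e = ss_e / df_e

print(round(ss_e, 3), df_e, round(ms_e, 3))
```

The direct calculation agrees with the value $SS_E = 2.460$ on 18 degrees of freedom found by subtraction, which is a useful safeguard against arithmetic slips in the other sums of squares.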