Two-way ANOVA versus one-way ANOVA

Note that a two-way ANOVA design is considerably more efficient than a one-way design. In the last example, a one-way approach would fix the testing station and examine the electronic assemblies produced by a variety of machines. The experiment would then have to be repeated for every testing station. It would be very difficult (indeed impossible) to duplicate exactly the same conditions in all of these experiments, so the resulting experimental error could be very large. Remember also that a one-way design cannot check for interaction between the factors involved in the experiment. The three main advantages of a two-way ANOVA may be stated as follows:

1. It is possible to test the effects of two factors simultaneously. This saves both time and money.
2. It is possible to determine the level of interaction present between the factors involved.
3. The effect of one factor can be investigated over a variety of levels of another, so any conclusions reached may be applicable over a range of situations rather than a single situation.

Exercises

1. The temperatures, in degrees Celsius, at three locations in the engine of a vehicle are measured after each of five test runs. The data are as follows. Making the usual assumptions for a two-way analysis of variance without replication, test the hypothesis that there is no systematic difference in temperatures between the three locations. Use the 5% level of significance.

Location	Run 1	Run 2	Run 3	Run 4	Run 5
A	72.8	77.3	82.9	69.4	74.6
B	71.5	72.4	80.7	67.0	74.0
C	70.8	74.0	79.1	69.0	75.4

2. Waste cooling water from a large engineering works is filtered before being released into the environment. Three separate discharge pipes are used, each with its own filter. Five samples of water are taken on each of four days from each of the three discharge pipes and the concentrations of a pollutant, in parts per million, are measured. The data are given below. Analyse the data to test for differences between the discharge pipes. Allow for effects due to pipes and days and for an interaction effect. Treat the pipe effects as fixed and the day effects as random. Use the 5% level of significance.

Day	Pipe A
1	160	181	163	173	178
2	175	170	219	166	171
3	169	186	179	178	183
4	230	206	216	195	250
Day	Pipe B
1	172	164	186	185	172
2	177	170	156	140	155
3	193	194	189	156	181
4	212	235	195	206	209
Day	Pipe C
1	214	196	207	219	200
2	186	184	181	189	179
3	209	220	199	185	228
4	254	293	283	262	259

Answers

Exercise 1 (engine temperatures). We calculate totals as follows.

Run	Total
1	215.1
2	223.7
3	242.7
4	205.4
5	224.0
Total	1110.9

Location	Total
A	377.0
B	365.6
C	368.3
Total	1110.9

$\sum \sum y_{i j}^{2} = 82552.17$

The total sum of squares is

$82552.17 - \frac{1110.9^{2}}{15} = 278.916$ on $15 - 1 = 14$ degrees of freedom.

The between-runs sum of squares is

$\frac{1}{3} (215.1^{2} + 223.7^{2} + 242.7^{2} + 205.4^{2} + 224.0^{2}) - \frac{1110.9^{2}}{15} = 252.796$

on $5 - 1 = 4$ degrees of freedom.

The between-locations sum of squares is

$\frac{1}{5} (377.0^{2} + 365.6^{2} + 368.3^{2}) - \frac{1110.9^{2}}{15} = 14.196$ on $3 - 1 = 2$ degrees of freedom.

By subtraction, the residual sum of squares is

$278.916 - 252.796 - 14.196 = 11.924$ on $14 - 4 - 2 = 8$ degrees of freedom.

The analysis of variance table is as follows.

Source of variation	Sum of squares	Degrees of freedom	Mean square	Variance ratio
Runs	252.796	4	63.199
Locations	14.196	2	7.098	4.762
Residual	11.924	8	1.491
Total	278.916	14

The upper 5% point of the $F_{2, 8}$ distribution is 4.46. The observed variance ratio is greater than this so we conclude that the result is significant at the 5% level and reject the null hypothesis at this level. The evidence suggests that there are systematic differences between the temperatures at the three locations. Note that the Runs mean square is large compared to the Residual mean square showing that it was useful to allow for differences between runs.
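The arithmetic in this answer can be checked with a few lines of code. What follows is only a minimal sketch using the Python standard library. The function name `two_way_no_replication` and the dictionary keys are hypothetical names chosen for illustration. Rows are taken to be locations and columns to be runs.

```python
# Engine temperatures (degrees Celsius): one list per location, one entry per run.
temps = {
    "A": [72.8, 77.3, 82.9, 69.4, 74.6],
    "B": [71.5, 72.4, 80.7, 67.0, 74.0],
    "C": [70.8, 74.0, 79.1, 69.0, 75.4],
}


def two_way_no_replication(table):
    """Sums of squares and variance ratios for a rows-by-columns layout
    with a single observation per cell (hypothetical helper)."""
    rows = list(table.values())
    a, b = len(rows), len(rows[0])            # a locations, b runs
    grand_total = sum(sum(r) for r in rows)
    correction = grand_total ** 2 / (a * b)   # (grand total)^2 / N

    ss_total = sum(y * y for r in rows for y in r) - correction
    ss_rows = sum(sum(r) ** 2 for r in rows) / b - correction      # locations
    ss_cols = sum(sum(c) ** 2 for c in zip(*rows)) / a - correction  # runs
    ss_resid = ss_total - ss_rows - ss_cols   # residual by subtraction

    df_rows, df_cols = a - 1, b - 1
    df_resid = df_rows * df_cols
    ms_resid = ss_resid / df_resid
    return {
        "ss_total": ss_total,
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_resid": ss_resid,
        "df": (df_rows, df_cols, df_resid),
        "f_rows": (ss_rows / df_rows) / ms_resid,
        "f_cols": (ss_cols / df_cols) / ms_resid,
    }


result = two_way_no_replication(temps)
for name, value in result.items():
    print(name, value)
```

Here `f_rows` is the Locations variance ratio, to be compared with the upper 5% point of the $F_{2, 8}$ distribution. The critical value itself is not computed and must still be taken from tables.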

Exercise 2 (discharge pipes). We calculate totals as follows.

	Day 1	Day 2	Day 3	Day 4	Total
Pipe A	855	901	895	1097	3748
Pipe B	879	798	913	1057	3647
Pipe C	1036	919	1041	1351	4347
Total	2770	2618	2849	3505	11742

$\sum \sum \sum y_{i j k}^{2} = 2356870$

The total number of observations is $N = 60.$

The total sum of squares is

$2356870 - \frac{11742^{2}}{60} = 58960.6$

on $60 - 1 = 59$ degrees of freedom.

The between-cells sum of squares is

$\frac{1}{5} (855^{2} + \dots + 1351^{2}) - \frac{11742^{2}}{60} = 48943.0$

on $12 - 1 = 11$ degrees of freedom, where by “cell” we mean the combination of a pipe and a day.

By subtraction, the residual sum of squares is

$58960.6 - 48943.0 = 10017.6$

on $59 - 11 = 48$ degrees of freedom.

The between-days sum of squares is

$\frac{1}{15} (2770^{2} + 2618^{2} + 2849^{2} + 3505^{2}) - \frac{11742^{2}}{60} = 30667.3$

on $4 - 1 = 3$ degrees of freedom.

The between-pipes sum of squares is

$\frac{1}{20} (3748^{2} + 3647^{2} + 4347^{2}) - \frac{11742^{2}}{60} = 14316.7$

on $3 - 1 = 2$ degrees of freedom.

By subtraction the interaction sum of squares is

$48943.0 - 30667.3 - 14316.7 = 3959.0$

on $11 - 3 - 2 = 6$ degrees of freedom.

The analysis of variance table is as follows.

Source of variation	Sum of squares	Degrees of freedom	Mean square	Variance ratio
Pipes	14316.7	2	7158.4	10.85
Days	30667.3	3	10222.4	48.98
Interaction	3959.0	6	659.8	3.16
Cells	48943.0	11	4449.4	21.32
Residual	10017.6	48	208.7
Total	58960.6	59

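Notice that, because Days are treated as a random effect, we divide the Pipes mean square by the Interaction mean square rather than by the Residual mean square. The Pipes variance ratio, 10.85, is therefore referred to the $F_{2, 6}$ distribution, whose upper 5% point is 5.14, so the differences between pipes are significant at the 5% level.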
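The choice of denominator can be motivated by the expected mean squares. The following is a sketch only, assuming the restricted form of the mixed model (the exercise does not say which form is intended). The notation is introduced here for illustration: $a = 3$ pipes, $b = 4$ days, $n = 5$ samples per cell, fixed pipe effects $\alpha_{i}$, and variance components $\sigma_{\beta}^{2}$ (days), $\sigma_{\alpha \beta}^{2}$ (interaction) and $\sigma^{2}$ (residual).

$E(\text{Pipes mean square}) = \sigma^{2} + n \sigma_{\alpha \beta}^{2} + \frac{b n}{a - 1} \sum_{i} \alpha_{i}^{2}$

$E(\text{Days mean square}) = \sigma^{2} + a n \sigma_{\beta}^{2}$

$E(\text{Interaction mean square}) = \sigma^{2} + n \sigma_{\alpha \beta}^{2}$

$E(\text{Residual mean square}) = \sigma^{2}$

Under the null hypothesis of no pipe effects (all $\alpha_{i} = 0$), the Pipes and Interaction mean squares have the same expectation, so their ratio is the natural test statistic. Under these assumptions the Days and Interaction mean squares are each compared with the Residual mean square, which agrees with the variance ratios in the table.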

The upper 5% point of the $F_{6, 48}$ distribution is approximately 2.3. Thus the Interaction variance ratio is significant at the 5% level and we reject the null hypothesis of no interaction. We must therefore conclude that there are differences between the means for pipes and for days and that the difference between one pipe and another varies from day to day. Looking at the mean squares, however, we see that both the Pipes and Days mean squares are much bigger than the Interaction mean square. Therefore it seems that the interaction effect is relatively small compared to the differences between days and between pipes.
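The sums of squares and variance ratios in this answer can be reproduced with a short script. This is a minimal sketch using only the Python standard library. The function name `mixed_two_way` and the dictionary keys are hypothetical. It follows the convention used above, dividing the Pipes mean square by the Interaction mean square and the Days and Interaction mean squares by the Residual mean square.

```python
# Pollutant concentrations (ppm): for each pipe, one list of five samples per day.
pollutant = {
    "A": [[160, 181, 163, 173, 178], [175, 170, 219, 166, 171],
          [169, 186, 179, 178, 183], [230, 206, 216, 195, 250]],
    "B": [[172, 164, 186, 185, 172], [177, 170, 156, 140, 155],
          [193, 194, 189, 156, 181], [212, 235, 195, 206, 209]],
    "C": [[214, 196, 207, 219, 200], [186, 184, 181, 189, 179],
          [209, 220, 199, 185, 228], [254, 293, 283, 262, 259]],
}


def mixed_two_way(cells):
    """Two-way ANOVA with replication, pipes fixed and days random
    (hypothetical helper written for this exercise)."""
    pipes = list(cells.values())
    a, b, n = len(pipes), len(pipes[0]), len(pipes[0][0])
    observations = [y for pipe in pipes for day in pipe for y in day]
    correction = sum(observations) ** 2 / (a * b * n)

    cell_totals = [[sum(day) for day in pipe] for pipe in pipes]
    ss_total = sum(y * y for y in observations) - correction
    ss_cells = sum(t * t for row in cell_totals for t in row) / n - correction
    ss_pipes = sum(sum(row) ** 2 for row in cell_totals) / (b * n) - correction
    ss_days = sum(sum(col) ** 2 for col in zip(*cell_totals)) / (a * n) - correction
    ss_inter = ss_cells - ss_pipes - ss_days   # by subtraction
    ss_resid = ss_total - ss_cells             # by subtraction

    ms_pipes = ss_pipes / (a - 1)
    ms_days = ss_days / (b - 1)
    ms_inter = ss_inter / ((a - 1) * (b - 1))
    ms_resid = ss_resid / (a * b * (n - 1))
    return {
        "ss_total": ss_total, "ss_cells": ss_cells, "ss_pipes": ss_pipes,
        "ss_days": ss_days, "ss_inter": ss_inter, "ss_resid": ss_resid,
        "f_pipes": ms_pipes / ms_inter,   # fixed factor tested against interaction
        "f_days": ms_days / ms_resid,
        "f_inter": ms_inter / ms_resid,
    }


summary = mixed_two_way(pollutant)
for name, value in summary.items():
    print(name, round(value, 2))
```

The critical values are not computed here and must still be read from tables of the F distribution.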
