This tutorial is going to take what we learned in one-way ANOVA and extend it to two-way ANOVA. In a one-way ANOVA, we have

- A single dependent variable measured on an interval scale
- A single independent variable measured on a nominal scale

The example used in our one-way ANOVA tutorial was the cook times of four different brands of pasta. The brand of pasta was the independent variable, and the cook time (in minutes) was the dependent variable. We then compared the groups to determine whether the cook times were significantly different.

We can easily extend this to the case where we have *more than one* nominal independent variable. With two **factors**, we can do more than just compare means across the levels of each factor separately. We can also explore possible **interactions** between our two independent variables. An interaction means that the size of the effect of one variable depends on the value of another variable.

Here’s an example. We want to study the effect of both a new drug and exercise on reducing chronic pain.

We can look at two *main effects* as well as the interaction effect. That is, we can ask:

- Is reported pain lower for Drug A than Drug B?
- Is reported pain lower for those who exercise versus those who do not?
- Is the effect of Drug A versus Drug B *larger* when subjects exercise?

A **main effect** tests whether means are significantly different between the levels of one factor, averaging over the levels of the other factor. For example, this would test whether the mean is significantly higher or lower in one treatment level than in the other(s) on average. If there is no interaction, the difference will be the same regardless of the level of the other factor.

However, if there *is* an interaction, the difference in means between treatment levels will be *different* depending on the level of the other factor. The plots of means will *not* be parallel: the difference between levels of factor A depends on which line, defined by the level of factor B, you are looking at.

Interactions are interpreted as a *difference in differences of means.* Stated differently, an interaction means the effect of one treatment is context specific.
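The difference-in-differences idea is easy to see numerically. A minimal sketch in Python, using made-up cell means for a 2x2 layout (all values below are hypothetical):

```python
# Hypothetical cell means: keyed by (exercise level, drug).
means = {
    ("No Exercise", "Drug A"): 7.0,
    ("No Exercise", "Drug B"): 5.0,
    ("Exercise", "Drug A"): 4.0,
    ("Exercise", "Drug B"): 3.5,
}

# Effect of Drug A vs. Drug B within each exercise level.
diff_no_exercise = means[("No Exercise", "Drug A")] - means[("No Exercise", "Drug B")]
diff_exercise = means[("Exercise", "Drug A")] - means[("Exercise", "Drug B")]

# The interaction is the difference in these differences; zero would mean
# the drug effect is identical at both exercise levels (parallel lines).
interaction = diff_no_exercise - diff_exercise
print(diff_no_exercise, diff_exercise, interaction)  # 2.0 0.5 1.5
```

Here the drug difference is 2.0 points without exercise but only 0.5 with it, so the drug effect is context specific.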

When we have two factors, but we do not care about the interaction, we say we have a **two-way ANOVA**. However, if we are interested in the interaction, we say we have a **two-way factorial ANOVA**. If we added a third factor, we would have a three-way ANOVA. When we test the interaction between all factors, including three-way interactions, we say our ANOVA is *fully factorial*.

Suppose there are three candidate drugs; our drug and exercise example is then a 2x3 design:

| | Drug A | Drug B | Drug C | Total |
|---|---|---|---|---|
| No Exercise | \(\mu_{11}\) | \(\mu_{12}\) | \(\mu_{13}\) | \(\mu_{1.}\) |
| Exercise | \(\mu_{21}\) | \(\mu_{22}\) | \(\mu_{23}\) | \(\mu_{2.}\) |
| Total | \(\mu_{.1}\) | \(\mu_{.2}\) | \(\mu_{.3}\) | \(\mu_{..}\) |

## Analyzing Two-Way Factorial ANOVA

We will now have a separate F test for each component of the design we want to test. Our results table will thus have three different F statistics.

- The main effect of Factor A
- The main effect of Factor B
- The interaction between A and B

Recall that

\[ F = \frac{\text{Variance between Groups}}{\text{Variance within Groups}} = \frac{MS_b}{MS_w} \]

We need to compute the appropriate sum of squares and degrees of freedom for each test.

- \(F_A = \frac{MS_A}{MS_w}\)
- \(F_B = \frac{MS_B}{MS_w}\)
- \(F_{AB} = \frac{MS_{AB}}{MS_w}\)

Note that \(MS_w\) is the same for each test. In the case of a 2x3 factorial ANOVA, we subtract the corresponding *cell* mean from each observation. For observation *i* in level *j* of the first factor and level *k* of the second, we need to square:

\[ x_{ijk} - \bar{x}_{jk} \]

Our *sum of squares within* is therefore:

\[ SS_w = \sum_k \sum_j \sum_i (x_{ijk} - \bar{x}_{jk})^2 \]

The triple summation means:

- sum the squared deviations within each cell, then
- sum those cell totals across the cells in each row, then
- sum across the rows

Thus, to get \(SS_w\), you will first need to get the means for each combination of treatment.
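As a concrete sketch, \(SS_w\) for a small, entirely hypothetical 2x2 data set (two observations per cell) can be computed as:

```python
# Hypothetical observations, keyed by (row, column) cell of a 2x2 layout.
data = {
    (1, 1): [4.0, 6.0],
    (1, 2): [5.0, 7.0],
    (2, 1): [3.0, 5.0],
    (2, 2): [8.0, 10.0],
}

# SS_w: square each observation's deviation from its own cell mean,
# sum within each cell, then sum across all cells.
ss_w = 0.0
for obs in data.values():
    cell_mean = sum(obs) / len(obs)
    ss_w += sum((x - cell_mean) ** 2 for x in obs)

print(ss_w)  # 8.0: each cell contributes (+1)^2 + (-1)^2 = 2
```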

It is easiest to write the remaining three sums of squares assuming equal \(n\)s in each cell. The (between) sum of squares for Factor A is:

\[ SS_A = nK \sum_j (\bar{x}_{j.} - \bar{\bar{x}})^2 \]

The notation is based on the following table of cell and marginal means:

| | A Level 1 | A Level 2 | A Level 3 | Total |
|---|---|---|---|---|
| B Level 1 | \(\bar{x}_{11}\) | \(\bar{x}_{21}\) | \(\bar{x}_{31}\) | \(\bar{x}_{.1}\) |
| B Level 2 | \(\bar{x}_{12}\) | \(\bar{x}_{22}\) | \(\bar{x}_{32}\) | \(\bar{x}_{.2}\) |
| Total | \(\bar{x}_{1.}\) | \(\bar{x}_{2.}\) | \(\bar{x}_{3.}\) | \(\bar{\bar{x}}\) |

The means in the bottom row and last column are called the *marginals*. To get \(SS_A\), subtract the grand mean from each Factor A marginal mean in the bottom row, square, and sum. Weight by \(nK\), which is the \(n\) from each cell times the \(K\) levels of Factor B being averaged over. The sum of squares for Factor B is similar:

\[ SS_B = nJ \sum_k (\bar{x}_{.k} - \bar{\bar{x}})^2 \]

We now use the Factor B marginals in the last column and weight by \(nJ\), the number in each cell times the \(J\) levels of Factor A being averaged over.

One way of thinking about an interaction is that it reflects a situation where a change in means is greater (or smaller) than the sum of its parts. This is because the interaction implies that the effect of a variable is greater (or lesser) in the presence of a specific level of the other variable. This extra boost above and beyond what the drug does by itself is the interaction.

Thus, the sum of squares for the interaction captures whatever deviation of the cell means from the grand mean remains after taking into account:

- Row mean deviations from the grand mean (Factor A sum of squares).
- Column mean deviations from the grand mean (Factor B sum of squares).

This leads to:

\[ SS_{AB} = \sum_k \sum_j n[(\bar{x}_{jk} - \bar{\bar{x}}) - (\bar{x}_{j.} - \bar{\bar{x}}) - (\bar{x}_{.k} - \bar{\bar{x}})]^2 \\ = \sum_k \sum_j n(\bar{x}_{jk} - \bar{x}_{j.} - \bar{x}_{.k} + \bar{\bar{x}})^2 \]

We also need the appropriate degrees of freedom to get the Mean Squares used to obtain F.

- \(df_w = N - JK\)
- \(df_A = J - 1\)
- \(df_B = K - 1\)
- \(df_{AB} = (J - 1)(K - 1)\)

Divide the appropriate \(SS\) by the respective \(df\) to get the respective \(MS\). Use these to form the \(F\) ratio. Then, compare it to an \(F\) distribution with \(df_1\) equal to the numerator degrees of freedom, and \(df_2\) equal to the denominator degrees of freedom.
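The whole procedure can be collected into one short function. This is a sketch for balanced designs only (equal \(n\) per cell); the data layout and names are hypothetical. A useful sanity check is that, in a balanced design, \(SS_A + SS_B + SS_{AB} + SS_w\) equals the total sum of squares.

```python
def two_way_anova(data):
    """data: dict mapping (j, k) -> list of observations, with equal cell sizes.
    Returns (SS, df, MS, F) dicts for effects A, B, AB, and within ("w")."""
    levels_a = sorted({j for j, _ in data})
    levels_b = sorted({k for _, k in data})
    J, K = len(levels_a), len(levels_b)
    n = len(next(iter(data.values())))  # common cell size (balanced design)
    N = n * J * K

    # Cell means, marginal means, and the grand mean.
    cell = {jk: sum(v) / n for jk, v in data.items()}
    a_marg = {j: sum(cell[(j, k)] for k in levels_b) / K for j in levels_a}
    b_marg = {k: sum(cell[(j, k)] for j in levels_a) / J for k in levels_b}
    grand = sum(cell.values()) / (J * K)

    ss = {
        "A": n * K * sum((a_marg[j] - grand) ** 2 for j in levels_a),
        "B": n * J * sum((b_marg[k] - grand) ** 2 for k in levels_b),
        "AB": n * sum((cell[(j, k)] - a_marg[j] - b_marg[k] + grand) ** 2
                      for j in levels_a for k in levels_b),
        "w": sum((x - cell[jk]) ** 2 for jk, v in data.items() for x in v),
    }
    df = {"A": J - 1, "B": K - 1, "AB": (J - 1) * (K - 1), "w": N - J * K}
    ms = {key: ss[key] / df[key] for key in ss}
    f_stats = {key: ms[key] / ms["w"] for key in ("A", "B", "AB")}
    return ss, df, ms, f_stats


# Hypothetical demo data: 3 levels of A, 2 of B, n = 3 per cell.
demo = {(1, 1): [1.0, 2.0, 3.0], (1, 2): [2.0, 4.0, 6.0],
        (2, 1): [3.0, 3.0, 3.0], (2, 2): [5.0, 6.0, 7.0],
        (3, 1): [2.0, 2.0, 2.0], (3, 2): [4.0, 5.0, 6.0]}
ss, df, ms, f_stats = two_way_anova(demo)
```

Each F ratio is then compared to an F distribution with that effect's degrees of freedom in the numerator and \(df_w\) in the denominator.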

## Fully Worked Example

We are interested in exploring the effectiveness of three drugs on chronic pain:

- Drug A: An existing over-the-counter pain killer.
- Drug B: An existing prescription pain killer.
- Drug C: Medical Marijuana.

We are also interested in exploring whether physical activity affects outcomes or interacts with the drug. Our two treatment levels are:

- No Exercise.
- Exercise.

Our sample size is \(N = 60\). First, let’s inspect the data with boxplots of pain score by drug and by exercise.

Pot seems to have the lowest pain scores, followed by Prescription, then OTC. Prescription and Pot are approximately symmetric around their medians, but OTC has a negative skew. Exercise has a lower median score than No Exercise, but the spreads are about the same.

These boxplots do not tell us if there is an interaction. Use an *interaction plot* to assess the possibility that the drug differences depend on levels of exercise.

The two lines for Exercise vs. No Exercise are not parallel, which suggests a possible interaction effect. The difference between OTC and Prescription is larger for those who do not exercise than for those who do. We can also look at an interaction plot with exercise on the x-axis and separate lines for each drug.

This plot is also consistent with an interaction effect. It shows a larger difference between Prescription and OTC, as well as between Prescription and Pot, for those who do not exercise than for those who do. Whether these sample differences are large enough to be statistically significant is what the F tests will tell us.

This is a 2x3 factorial design: two exercise levels by three drug levels.

We need to come up with the appropriate sums of squares and degrees of freedom for each test.

- \(F_A = \frac{MS_A}{MS_w}\)
- \(F_B = \frac{MS_B}{MS_w}\)
- \(F_{AB} = \frac{MS_{AB}}{MS_w}\)

In our data, there are exactly 10 participants in each cell. Cell means are:

| | OTC | Prescription | Pot | Overall |
|---|---|---|---|---|
| Exercise | 5.4 | 5.4 | 3.1 | 4.633 |
| No Exercise | 7.8 | 5.8 | 4.7 | 6.100 |
| Overall | 6.6 | 5.6 | 3.9 | 5.367 |

First, get \(SS_A\):

\[ SS_A = nK \sum_j (\bar{x}_{j.}-\bar{\bar{x}})^2 \\ = 10 * 2*((6.6-5.37)^2 +(5.6-5.37)^2 +(3.9-5.37)^2) \\ = 74.53 \]

Next, get \(SS_B\):

\[ SS_B = nJ \sum_k (\bar{x}_{.k}-\bar{\bar{x}})^2 \\ = 10 * 3 * ((4.633 - 5.37)^2 + (6.1 - 5.37)^2) \\ = 32.28 \]

Now get the sum of squares for the interaction:

\[ SS_{AB} = \sum_k \sum_j n (\bar{x}_{jk} - \bar{x}_{j.}-\bar{x}_{.k}+\bar{\bar{x}})^2 \\ = 10 * ((5.4 - 6.6 - 4.633 + 5.37)^2 + \\ (5.4 - 5.6 - 4.633 + 5.37)^2 + \\ (3.1 - 3.9 - 4.633 + 5.37)^2 + \\ (7.8 - 6.6 - 6.1 + 5.37)^2 + \\ (5.8 - 5.6 - 6.1 +5.37)^2 + \\ (4.7 - 3.9 - 6.1 + 5.37)^2) \\ = 10.134 \]
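These hand calculations can be checked from the cell means alone. The sketch below recomputes the marginals and grand mean from the unrounded cell means, so its results differ from the hand-rounded values above only in the second decimal place:

```python
# Cell means from the example: keyed by (exercise level, drug).
cells = {
    ("Exercise", "OTC"): 5.4, ("Exercise", "Prescription"): 5.4,
    ("Exercise", "Pot"): 3.1,
    ("No Exercise", "OTC"): 7.8, ("No Exercise", "Prescription"): 5.8,
    ("No Exercise", "Pot"): 4.7,
}
drugs = ["OTC", "Prescription", "Pot"]
exercise = ["Exercise", "No Exercise"]
n, J, K = 10, 3, 2  # 10 per cell, 3 drug levels, 2 exercise levels

# Marginal means and grand mean, recomputed without rounding.
drug_marg = {d: sum(cells[(e, d)] for e in exercise) / K for d in drugs}
ex_marg = {e: sum(cells[(e, d)] for d in drugs) / J for e in exercise}
grand = sum(cells.values()) / (J * K)

ss_a = n * K * sum((drug_marg[d] - grand) ** 2 for d in drugs)
ss_b = n * J * sum((ex_marg[e] - grand) ** 2 for e in exercise)
ss_ab = n * sum((cells[(e, d)] - drug_marg[d] - ex_marg[e] + grand) ** 2
                for d in drugs for e in exercise)
print(round(ss_a, 2), round(ss_b, 2), round(ss_ab, 2))  # 74.53 32.27 10.13
```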

To compute the Mean Squares, we need the appropriate degrees of freedom. (Recall \(N =60\)).

- \(df_W = N - JK = 60 - (3)(2) = 54\)
- \(df_A = J - 1 = 3 - 1 = 2\)
- \(df_B = K - 1 = 2 - 1 = 1\)
- \(df_{AB} = (J - 1)(K - 1) = (3 -1)(2-1) = 2\)

Use these to get our \(F\) statistics, dividing each effect's mean square by \(MS_W\):

\[ F_{\text{effect}} = \frac{MS_{\text{effect}}}{MS_W} \]

- \(F_A = \frac{MS_A}{MS_W} = \frac{74.53/2}{383/54} = 5.25\)
- \(F_B = \frac{MS_B}{MS_W}= \frac{32.28/1}{383/54} = 4.55\)
- \(F_{AB} = \frac{MS_{AB}}{MS_W}= \frac{10.13/2}{383/54}= 0.71\)
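The same arithmetic in code, using the sums of squares computed above and the given \(SS_w = 383\):

```python
# Sums of squares and degrees of freedom from the worked example.
ss = {"A": 74.53, "B": 32.28, "AB": 10.13, "w": 383.0}
df = {"A": 2, "B": 1, "AB": 2, "w": 54}

# Mean squares, then F ratios against the within mean square.
ms = {key: ss[key] / df[key] for key in ss}
f = {key: ms[key] / ms["w"] for key in ("A", "B", "AB")}

print({key: round(val, 2) for key, val in f.items()})  # {'A': 5.25, 'B': 4.55, 'AB': 0.71}
```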

To determine if any of these effects are significant, compare to the appropriate \(F\) distribution.
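Rather than looking critical values up in a table, they can be computed from the F distribution's quantile function; a sketch assuming SciPy is available:

```python
from scipy.stats import f  # F distribution

alpha = 0.05
# Critical value = (1 - alpha) quantile of F with (df1, df2) degrees of freedom.
crit_drug = f.ppf(1 - alpha, 2, 54)         # drug main effect: df (2, 54)
crit_exercise = f.ppf(1 - alpha, 1, 54)     # exercise main effect: df (1, 54)
crit_interaction = f.ppf(1 - alpha, 2, 54)  # interaction: df (2, 54)

print(round(crit_drug, 2), round(crit_exercise, 2))  # 3.17 4.02
```

An observed F statistic larger than its critical value leads to rejecting the corresponding null hypothesis.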

*Is the interaction significant?*

- \(F_{AB} = \frac{MS_{AB}}{MS_W}=0.71\)
- \(df_1 = 2, df_2 = 54\)
- Critical \(F\) is 3.17

No, we cannot reject the null hypothesis of no interaction effect. The effect of drug is not dependent on the level of exercise. Also, the effect of exercise is not dependent on the level of drug.

*Is the main effect of drug significant?*

- \(F_A = \frac{MS_A}{MS_W} = 5.25\)
- \(df_1 = 2, df_2 = 54\)
- Critical \(F\) is 3.17

Yes, reject the null of no drug main effect.

*Is the main effect of exercise significant?*

- \(F_B = \frac{MS_B}{MS_W} = 4.55\)
- \(df_1 = 1, df_2 = 54\)
- Critical \(F\) is 4.02

Yes, reject the null of no exercise main effect.

Since the interaction is *not* significant, we can carry out pairwise comparisons of marginal means within each main-effect family. With just two exercise levels, there is only one comparison, so no multiple-comparison adjustment is needed.

| Comparison | Difference | Lower Interval | Upper Interval | p-adj |
|---|---|---|---|---|
| No Exercise - Exercise = 0 | 1.5 | 0.10 | 2.84 | 0.04 |

These results indicate that there is a significant difference between *exercise* and *no exercise* at the \(\alpha = 0.05\) level. There are three levels of the drug factor, so the pairwise comparisons require an adjustment for multiple tests. The following shows the results using Tukey’s HSD test.

| Comparison | Difference | Lower Interval | Upper Interval | p-adj |
|---|---|---|---|---|
| Prescription - OTC = 0 | -1.0 | -3.02 | 1.02 | 0.46 |
| Pot - OTC = 0 | -2.7 | -4.72 | -0.68 | 0.01 |
| Pot - Prescription = 0 | -1.7 | -3.72 | 0.32 | 0.11 |

There is also a significant difference between *Pot* and *OTC* at the \(\alpha = 0.05\) level.

Had the interaction been significant, we would have tested the effect of Drug within each level of Exercise. Then, where significant, we would have carried out pairwise comparisons with Bonferroni adjustments.