Conducting ANOVA in R
In the previous section, we went over what ANOVA is and how to do it by hand. Now we will go over how to do it using r. We will be using a different dataset than the pervious example, which can be found here:
data <- read_excel("data/ANOVA Lab 1.xlsx")
We want to study the effectiveness of different treatments on anxiety. We collect a sample of 75 subjects in the following categories:
- No treatment (\(n_1\) = 27).
- Biofeedback (\(n_2\) = 24).
- Cognitive-behavioral Treatment (\(n_3\) = 24).
The dependent variable is anxiety levels.
- \(H_0\) : \(\mu_1 = \mu_2 = \mu_3\)
- \(H_A\) : At least two \(\mu_i\)’s are different
If even two of the means are significantly different, we reject the null hypothesis.
We can visualize the data with bar graphs to check for normality and equality of variances across groups:
data %>% ggplot(aes(x = treatment, group = treatment, y = anxiety)) + geom_boxplot() + xlab("Treatment") + ylab("Anxiety")
Normality looks good. The equality of variances assumption appears suspect (this can be tested). For now, let’s do our ANOVA.
Steps to Doing an ANOVA
The steps to doing an ANOVA in r are as follows:
- Assert the null hypothesis: all means are equal.
- Visualize the data.
- Use the R function
aov()to compute the analysis of variance.
- View the summary of the analysis.
- Compare to F distribution with \(df_B\) and \(df_W\).
- An F in the tail of the distribution means reject the null hypothesis.
We can conduct our ANOVA and view the results below:
# Compute ANOVA data.aov <- aov(data = data, anxiety ~ as.factor(treatment)) # View Summary summary(data.aov)
## Df Sum Sq Mean Sq F value Pr(>F) ## as.factor(treatment) 2 1346 673 8.746 0.000398 *** ## Residuals 72 5541 77 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We find the \(F\) statistic to be 8.746. Compare to an \(F\) distribution with \(df_1\) = 2 and \(df_2\) = 72. The cut-off for significance given these \(df\) is 3.124, so we have a significant result.
xmin <- 0 xmax <- 10 x = seq(xmin,xmax, by = .01) y = df(x, df1 = 2, df2 = 72) f_data <- tibble(x = x, y = y) xbar1 <- 3.124 subset_x1 <- x[x > xbar1] poly_data1 <- tibble(x_poly = c(xbar1, subset_x1, xmax), y_poly = c(0, df(subset_x1, df1 = 2, df2 = 72), 0)) ggplot(f_data, aes(x = x, y = y)) + geom_line() + geom_polygon(data = poly_data1, aes(x = x_poly, y = y_poly), fill = "firebrick", color = "black") + labs(x = "F", y = "") + geom_segment(aes(x = xbar1, y = .01, xend = xbar1, yend = .2)) + geom_text(x = xbar1, y = .25, label = "3.124") + geom_segment(aes(x = 8.746, y = .01, xend = 8.746, yend = .05)) + geom_text(x = 8.746, y = .12, label = "8.746")
We can reject the null hypothesis that all \(\mu_i\)’s are equal.
Contrasts in R
So we’ve found F to be significant. Great! One problem: We do not know which of the differences are significant.
- Is biofeedback significantly different from the control?
- Is CBT significantly different from the control?
- Is biofeedback significantly different from CBT?
We need to dig a little deeper and make possibly multiple pairwise comparisons. We will start with a pairwise t-test.
pairwise.t.test(data$anxiety, data$treatment, p.adjust.method = "BH")
## ## Pairwise comparisons using t tests with pooled SD ## ## data: data$anxiety and data$treatment ## ## 1 2 ## 2 0.08678 - ## 3 0.00025 0.03076 ## ## P value adjustment method: BH
We get a table of p-values adjusted using the Benjamin-Hochberg method. Assuming an \(\alpha\) level of 0.05, \(\mu_1\) to \(\mu_3\) are significantly different.
We can also conduct multiple pairwise comparisons using the Tukey method.
## Tukey multiple comparisons of means ## 95% family-wise confidence level ## ## Fit: aov(formula = anxiety ~ as.factor(treatment), data = data) ## ## $`as.factor(treatment)` ## diff lwr upr p adj ## 2-1 -4.273148 -10.16263 1.61633472 0.1988113 ## 3-1 -10.273148 -16.16263 -4.38366528 0.0002413 ## 3-2 -6.000000 -12.06023 0.06022788 0.0529002
diff gives the difference between the means of the two groups,
upr give the lower and upper bound for the confidence interval of the difference between means, and
p adj is the \(p\) value after adjusting for multiple comparisons. Again, only \(\mu_1\) to \(\mu_3\) gives a significant result.