This tutorial takes the theory from our Two-Way ANOVA tutorial and walks through how to apply it in SAS. We will use the `Moore` dataset, which can be downloaded here.
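Before any of the procedures in this tutorial can run, the data must be available as a SAS dataset named `moore`. The following is a minimal sketch of one way to do that, assuming the download is a CSV file with a header row containing the columns `conformity`, `fcategory`, and `partner_status`. The file path is a placeholder, not the actual download location.

```sas
/* Read the CSV into a dataset named moore.
   The path below is hypothetical; point it at your downloaded file. */
proc import datafile="/path/to/Moore.csv"
    out=moore
    dbms=csv
    replace;
    getnames=yes;   /* take variable names from the header row */
run;

/* Confirm the variable names and types before modelling */
proc contents data=moore;
run;
```

If the column names in your copy differ from those used here, rename them or adjust the later code to match.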

This data frame consists of subjects in a “social-psychological experiment who were faced with manipulated disagreement from a partner of either of low or high status. The subjects could either conform to the partner’s judgment or stick with their own judgment.” (John Fox, Sanford Weisberg and Brad Price (2018). carData: Companion to Applied Regression Data Sets. R package version 3.0-2. https://CRAN.R-project.org/package=carData).

We are interested in how an individual's conformity depends on their partner's status, which has two levels:

- Low Status
- High Status

We are also interested in exploring whether their F-score category (a measure of authoritarianism) affects outcomes or interacts with partner status. Our three treatment levels are:

- Low F-score
- Medium F-score
- High F-score
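With two factors, it helps to know how many subjects fall into each of the six factor combinations, since unequal cell sizes affect how the sums of squares are interpreted. A short sketch using `proc freq`, assuming the dataset is named `moore` as in the rest of this tutorial:

```sas
/* Count subjects in each fcategory x partner_status cell.
   The options suppress row, column, and overall percentages
   so that only the raw counts are shown. */
proc freq data=moore;
    tables fcategory*partner_status / norow nocol nopercent;
run;
```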

Our sample size is \(N = 45\). First, let’s inspect the data for outliers or funky distributions. The following boxplot shows the distribution of scores on the conformity variable within each combination of `partner_status` and `fcategory`. We will use the following code:

```
proc sgplot data=moore;
vbox conformity / category=fcategory group=partner_status;
xaxis values=('low' 'medium' 'high');
run;
```

We get this figure:

We can also get a sense of whether an interaction is present by looking at an interaction plot. An interaction plot shows the means of the outcome within each level of one factor, with separate lines for the levels of the other factor. Parallel lines indicate that no interaction is present, because the mean differences in the first factor are the same regardless of the level of the other factor. Non-parallel lines mean that an interaction is likely present. In other words, the mean differences on the first factor depend on the level of the second factor. When we calculate the two-way ANOVA using the `glm` procedure, it will generate an interaction plot.

```
proc glm data=moore;
class fcategory (ref = "high") partner_status (ref = "high");
model conformity = fcategory | partner_status;
run;
```

This syntax specifies that we have two categorical variables, `fcategory` and `partner_status`. The `ref` options are not required but are specified here so that the figures output by the procedure are ordered from low to high. If we were to interpret the results as a regression model, the `ref` options would also specify which level of each categorical variable is treated as the reference level. Finally, the `model` statement specifies that `conformity` is the outcome, and `fcategory | partner_status` means we want both main effects and the interaction for our treatment variables.
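The bar operator is shorthand: `fcategory | partner_status` expands to `fcategory partner_status fcategory*partner_status`. Written as an equation, the model being fit has the usual two-way factorial form. The symbols below are our own notation, introduced only for illustration:

\[
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}
\]

Here \(y_{ijk}\) is the conformity score of subject \(k\) in F-score category \(i\) with partner status \(j\), \(\alpha_i\) and \(\beta_j\) are the two main effects, \((\alpha\beta)_{ij}\) is the interaction, and \(\varepsilon_{ijk}\) is the error term. "No interaction" corresponds to every \((\alpha\beta)_{ij}\) being zero, which is the case where the lines in an interaction plot are parallel.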

We will get two tables with class level information, three tables with the ANOVA results, and an interaction plot. First, let’s look at the interaction plot.

Looking at the plot, there appears to be an interaction between `partner_status` and `fcategory`. The difference in means between the two partner status levels is small when the F-score category is high but larger when the F-score category is medium or low.
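The pattern read off the plot can also be checked numerically by tabulating the cell means that the plot is drawn from. A minimal sketch with `proc means`, again assuming the dataset is named `moore`:

```sas
/* Cell sizes, means, and standard deviations of conformity.
   NWAY restricts the output to the full fcategory x partner_status
   cross-classification, one row per cell. */
proc means data=moore nway n mean std maxdec=2;
    class fcategory partner_status;
    var conformity;
run;
```

Comparing the two partner status means within each F-score category row by row gives the same differences that appear as vertical gaps between the lines in the plot.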

Now let’s go back and look at the tables. The first two tables give the class level variables (`fcategory` and `partner_status`), their possible levels (`high`, `medium`, `low` and `high`, `low`, respectively), and the number of observations used (\(n=45\)).

The next three tables give the results of the ANOVA. The first gives the result of the overall model. We see that \(p=0.007\), so at least one of the terms is significant. The next table gives the model fit. A higher \(R^2\) value generally indicates better fit. The last table gives the ANOVA results broken down by main effects and interactions using Type III sums of squares. For more information on the types of sums of squares, see our tutorial here.
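To see how the choice of sums of squares matters, the `model` statement in `proc glm` accepts options that request specific types. The sketch below asks for Type I (sequential) and Type III (each effect adjusted for all others) side by side. It is an illustration, not part of the original analysis.

```sas
proc glm data=moore;
    class fcategory (ref = "high") partner_status (ref = "high");
    /* ss1 = Type I (sequential), ss3 = Type III (partial) */
    model conformity = fcategory | partner_status / ss1 ss3;
run;
quit;
```

When the cell sizes are unequal, the two tables can disagree for the main effects, and the Type I results depend on the order in which the terms are listed. The test for the last term in the model, here the interaction, is the same under both types.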

The first thing to investigate is the significance of the interaction. If it is non-significant, we can proceed to looking at the main effects. However, if the interaction is significant, the main effects will not be very helpful, as we will need to explore when each factor is significant given levels of the other factor. We indeed find that \(p = 0.023\) for `fcategory*partner_status`. This tells us that the effect of partner status will depend on levels of F-category, or vice versa. We will want to determine when each factor is significant.

The next step is to determine the nature of the interaction. For example, in the interaction plot above we saw that the effect of `partner_status` appeared weak in the `fcategory = high` group, but it was larger for the other two `fcategory` levels. Perhaps `partner_status` has a significant effect on conformity, but *only* when `fcategory` is not high. To test this, we’ll need to perform something akin to a one-way ANOVA for `partner_status` within each level of `fcategory`. However, the correct F-test will utilize all of the information from the full two-way factorial model. To get this test, it is necessary to use an `lsmeans` statement with our model. We will add the following line of code to our `proc glm` step:

`lsmeans fcategory*partner_status / slice=fcategory;`

So we will run:

```
proc glm data=moore;
class fcategory (ref = "high") partner_status (ref = "high");
model conformity = fcategory | partner_status;
lsmeans fcategory*partner_status / slice=fcategory;
run;
```

The `slice` option gives a test of the effect of partner status within each `fcategory` level. We can run this and then look at the table for the `fcategory*partner_status` effect sliced by `fcategory` for conformity:

We can see that partner status is significant within the low (\(p = 0.002\)) and medium (\(p = 0.012\)) F-score category groups. The effect of partner status is not significant when F-category is high. This corroborates what we saw in the interaction plot. Because `partner_status` only has two levels, we do not need to do any further pairwise comparisons.
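The situation is different if the interaction is sliced the other way, since `fcategory` has three levels and a significant slice would not say which categories differ. The following sketch shows one way this could be set up. It is an illustration rather than part of the analysis above, and the choice of a Tukey adjustment is our assumption.

```sas
proc glm data=moore;
    class fcategory (ref = "high") partner_status (ref = "high");
    model conformity = fcategory | partner_status;
    /* slice=partner_status tests the fcategory effect within each
       partner status level; pdiff requests pairwise differences of
       the cell means, with p-values adjusted by adjust=tukey. */
    lsmeans fcategory*partner_status / slice=partner_status pdiff adjust=tukey;
run;
quit;
```

Note that `pdiff` on the interaction term compares all six cell means with one another, so the comparisons of interest (pairs of F-score categories within the same partner status) have to be picked out of the larger table.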