In the same way as DV ~ IV is R formula for linear regression or one way ANOVA, the formula for two-way ANOVA is DV ~ IV1 * IV2 or DV ~ IV1 + IV2 + IV1 : IV2
Especially the latter one illustrates that we're looking at two predictors and the interaction between those.
Now, we'll apply it to ToothGrowth dataset provided in R.
First, load it:
Look at the dataset:
Dose is treated as numerical variable, instead we want it to be categorical, let's code the factors
> ToothGrowth$dose = factor(ToothGrowth$dose, levels=c(0.5,1.0,2.0), labels=c("low","med","high"))
See what changed:
Start with some descriptive statistics:
> boxplot(len ~ supp * dose, data=ToothGrowth,
+ ylab="Tooth Length", main="Boxplots of Tooth Growth Data")
You might also check histograms of each factor combination to decide if the mean is an appropriate summary statistic. If this looks okay a useful way to summarise the overall pattern of two-way data is an interaction plot:
> interaction.plot(x.factor=dose, trace.factor=supp, response=len, fun=mean, type=”b”)
And test the assumption that variances are equal:
> bartlett.test(len ~ supp * dose, data=ToothGrowth)
Now, let's look at the model and the summary:
also, look at summary.lm() output to see all coefficients.
Same as for one way ANOVA we can also call aov() function:
> aov.out = aov(len ~ supp * dose, data=ToothGrowth)
This allows us to look at the results of planned contrasts:
Note that the interaction is significant so we should be careful about interpreting the main effects. We can run post-hoc analysis between all 3*2=6 conditions:
Or we can restrict it to one predictor (dose):
> TukeyHSD(aov.out, which=c("dose"))
We can test for the effect without the interaction:
Or could do with interaction but without one main effect
And could compare whether one model explains significantly more than the other:
In fact the ANOVA test between models assumes that the simpler model is a strict subset of the other. Is that true in this case?