The two-way ANOVA (analysis of variance) is a statistical method for evaluating the simultaneous effect of two categorical variables on a quantitative continuous variable.
The two-way ANOVA extends the one-way ANOVA: it evaluates the effects of two categorical variables, instead of one, on a numerical response.
The advantage of a two-way ANOVA over a one-way ANOVA is that we test the relationship between two variables while taking into account the effect of a third variable. It also lets us include the interaction between the two categorical variables, in order to evaluate whether they act jointly on the response variable.
The advantage of a two-way over a one-way ANOVA is quite similar to the advantage of a multiple linear regression over a correlation:
- The correlation measures the relationship between two quantitative variables. The multiple linear regression also measures the relationship between two variables, but this time taking into account the potential effect of other covariates.
- The one-way ANOVA tests whether a quantitative variable is different between groups. The two-way ANOVA also tests whether a quantitative variable is different between groups, but this time taking into account the effect of another qualitative variable.
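To make the analogy concrete, here is a minimal sketch in R. It assumes the built-in `mtcars` and `warpbreaks` datasets purely for illustration (they are not the data analyzed in this post), and the object names are our own.

```r
# Illustration only: mtcars and warpbreaks ship with base R and are
# assumptions made for this sketch, not the data used in this post.

# Correlation: relationship between two quantitative variables
r <- cor(mtcars$mpg, mtcars$wt)

# Multiple linear regression: relationship between mpg and wt,
# taking into account another covariate (hp)
reg <- lm(mpg ~ wt + hp, data = mtcars)

# One-way ANOVA: does the mean number of breaks differ between tension levels?
one_way <- aov(breaks ~ tension, data = warpbreaks)

# Two-way ANOVA: same response, now with a second factor (wool).
# wool * tension expands to wool + tension + wool:tension,
# so the interaction between the two factors is included.
two_way <- aov(breaks ~ wool * tension, data = warpbreaks)

summary(one_way)
summary(two_way)
```

The summary of the two-way model contains one row per main effect (`wool`, `tension`), one row for the interaction (`wool:tension`), and one for the residuals, whereas the one-way summary has a single row for `tension` besides the residuals.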
Previously, we discussed one-way ANOVA in R. Now, we show when, why, and how to perform a two-way ANOVA in R.
Before going further, I would like to mention and briefly describe some related statistical methods and tests in order to avoid any confusion:
- A Student’s t-test is used to evaluate the effect of one categorical variable on a quantitative continuous variable, when the categorical variable has exactly 2 levels:
  - Student’s t-test for independent samples if the observations are independent (for example, if we compare the age between women and men)
  - Student’s t-test for paired samples if the observations are dependent, that is, when they come in pairs (this is the case when the same subjects are measured twice, at two different points in time, before and after a treatment for example)
- To evaluate the effect of one categorical variable on a quantitative variable, when the categorical variable has 3 or more levels:
  - one-way ANOVA (often simply referred to as ANOVA) if the groups are independent (for example, a group of patients who received treatment A, another group of patients who received treatment B, and a last group of patients who received no treatment or a placebo)
  - repeated measures ANOVA if the groups are dependent (when the same subjects are measured three times, at three different points in time, before, during and after a treatment for example)
- A two-way ANOVA is used to evaluate the effects of 2 categorical variables (and their potential interaction) on a quantitative continuous variable. This is the topic of the post.
- Linear regression is used to evaluate the relationship between a quantitative continuous dependent variable and one or several independent variables:
  - simple linear regression if there is only one independent variable (which can be quantitative or qualitative)
  - multiple linear regression if there are at least two independent variables (which can be quantitative, qualitative, or a mix of both)
- An ANCOVA (analysis of covariance) is used to evaluate the effect of a categorical variable on a quantitative variable, while controlling for the effect of another quantitative variable (known as a covariate). ANCOVA is actually a special case of multiple linear regression, with one qualitative and one quantitative independent variable.
- A mixed ANOVA is used to test differences between two or more groups whilst subjecting participants to repeated measures: one factor (a fixed effects factor) is a between-subjects variable (for example, treatment A and B, with patients receiving only one of the two treatments) and the other (a random effects factor) is a within-subjects variable (for example, measurements are made on day 1, day 2 and day 3 on all subjects).
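For orientation, several of the methods listed above can be sketched with base R functions. This is only an illustration under stated assumptions: the built-in `ToothGrowth` and `sleep` datasets and the object names are our own choices, and the repeated measures and mixed ANOVA are left out because they require a subject-level error structure.

```r
# Illustration only: ToothGrowth and sleep ship with base R and are
# assumptions made for this sketch, not the data used in this post.

# Student's t-test for independent samples (categorical variable with 2 levels).
# var.equal = TRUE requests Student's test; R's default is Welch's test.
t_indep <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)

# Student's t-test for paired samples: the same 10 subjects measured twice
t_paired <- with(sleep, t.test(extra[group == "1"], extra[group == "2"], paired = TRUE))

# One-way ANOVA: categorical variable with 3 levels (dose converted to a factor)
one_way <- aov(len ~ factor(dose), data = ToothGrowth)

# Simple linear regression: one independent variable
simple_reg <- lm(len ~ dose, data = ToothGrowth)

# ANCOVA as a multiple linear regression: one qualitative variable (supp)
# and one quantitative covariate (dose)
ancova <- lm(len ~ supp + dose, data = ToothGrowth)

summary(ancova)
```

Note that `dose` is treated as a grouping factor in the one-way ANOVA and as a numeric covariate in the regression and ANCOVA models, which is what distinguishes these approaches in practice.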
In this post, we start by explaining when and why a two-way ANOVA is useful. We then carry out some preliminary descriptive analyses and present how to conduct a two-way ANOVA in R. Finally, we show how to interpret and visualize the results, and briefly illustrate how to verify the underlying assumptions.