Two-Way Analysis of Variance (ANOVA) Using JASP

When to use a two-way ANOVA

A two-way analysis of variance (ANOVA) is a statistical test that can be used to assess whether the mean value of some outcome depends on two other factors.

Example: we might want to assess if weight depends on nationality and gender.

Here weight is our Dependent variable.
Nationality and gender are our two Factors (Independent variables) that we think weight may depend on.

We might also be interested in whether the way that gender affects weight is the same across all nationalities, that is, whether there is an interaction between the two factors.

The observations/measurements on the Dependent variable, such as weight, must be Scale data.

For more help on “What test do I need” go to the sigma website statistical worksheets resources page.

Example

An educational researcher collected data on maths scores from adults of different ages living in different countries. They were grouped into nationality according to whether they were from the UK, Europe or the rest of the World. Their age was classified according to whether they fell into three pre-defined age groups: Younger, Middle-aged or Older. The data are shown below and can be downloaded in a CSV file called maths.csv, where Nationality was coded 1=UK, 2=Europe and 3=Rest of World, and Age was coded as 1=Younger, 2=Middle-aged and 3=Older.
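The numeric codes can be turned back into labels before analysis. The following is a minimal sketch: the column names Maths_score, Nationality and Age are assumed from the JASP steps in this worksheet, the helper name decode_row is hypothetical, and the example row is illustrative rather than read from maths.csv.

```python
# Code-to-label mappings as described for maths.csv in the worksheet.
NATIONALITY_LABELS = {1: "UK", 2: "Europe", 3: "Rest of World"}
AGE_LABELS = {1: "Younger", 2: "Middle-aged", 3: "Older"}


def decode_row(row):
    """Return a copy of one data row with numeric codes replaced by labels.

    `row` is assumed to be a dict with the keys "Maths_score", "Nationality"
    and "Age" (assumed column names), with values as strings or numbers.
    """
    return {
        "Maths_score": float(row["Maths_score"]),
        "Nationality": NATIONALITY_LABELS[int(row["Nationality"])],
        "Age": AGE_LABELS[int(row["Age"])],
    }


# Illustrative row in the form csv.DictReader would produce (all strings).
example_row = {"Maths_score": "49", "Nationality": "1", "Age": "1"}
print(decode_row(example_row))
```

With the real file, the same function could be applied to each row returned by csv.DictReader, provided the column names match.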

A two-way Analysis of Variance (ANOVA) can be used here to test three different sets of hypotheses:

Test 1:

H0: There is no interaction between nationality and age

H1: There is an interaction between nationality and age

The presence of an interaction between nationality and age means that any age effects observed are not the same for all nationalities or, equivalently, that any differences due to nationality are not the same in every age group.
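One common way to write this down is the two-way ANOVA model with an interaction term. The notation is not part of the original worksheet, so the symbols below are assumptions introduced for illustration: Y_{ijk} is the maths score of person k in nationality i and age group j, \mu is the overall mean, \alpha_i is the nationality effect, \beta_j is the age effect, (\alpha\beta)_{ij} is the interaction and \varepsilon_{ijk} is the residual.

```latex
Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad
H_0 \text{ (Test 1)}: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
```

Under this H0 the expected difference between any two age groups, \beta_j - \beta_{j'}, is the same in every nationality, which is exactly the "no interaction" statement above.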

Test 2:

H0: There is no difference in mean maths score between the three nationality groups

H1: There is a difference in mean maths score between the three nationality groups

Test 3:

H0: There is no difference in mean maths score between the three age groups

H1: There is a difference in mean maths score between the three age groups

Note that in Tests 2 and 3 we assume that any age effects observed are the same for all nationalities or, equivalently, that any differences due to nationality are the same in every age group.

Hence, we should do Test 1 first, since if we conclude there is an interaction between nationality and age then we cannot do Tests 2 and 3. If there is an interaction, we should skip Tests 2 and 3 and instead look at the mean maths score for each nationality/age combination and report them (we will see how to plot these means in a profile plot).

If we conclude there is no interaction between nationality and age, then we can do Tests 2 and 3.
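The order of the tests can be summarised as a small decision rule. This is only a sketch of the logic described above: the function name plan_follow_up and the step descriptions are hypothetical, and the 0.05 threshold is the one used throughout this worksheet.

```python
def plan_follow_up(p_interaction, alpha=0.05):
    """Return the follow-up steps suggested by the interaction test (Test 1)."""
    if p_interaction < alpha:
        # Evidence of an interaction: main effects are not tested separately.
        return [
            "report the mean for each nationality/age combination",
            "inspect the profile plot",
        ]
    # No evidence of an interaction: proceed to the two main-effect tests.
    return [
        "Test 2: nationality effect",
        "Test 3: age effect",
    ]


# The interaction p-value reported later in this worksheet is 0.537.
print(plan_follow_up(0.537))
```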

Using JASP

From the ANOVA Menu, click ANOVA under Classical:

In the next dialogue window, move Maths_score into the Dependent Variable box, and move Nationality and Age to the Fixed Factors box. Also tick Descriptive Statistics (we will use this later).
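To see roughly what sits behind the ANOVA table, the sums of squares and F ratios can be computed by hand for a balanced design (the same number of scores in every nationality/age combination). The sketch below is not JASP's implementation. The function name two_way_anova and the toy scores are hypothetical, and it stops at the F ratios because the p-values need the F distribution, which is not in the Python standard library.

```python
from statistics import mean


def two_way_anova(cells):
    """Sums of squares, degrees of freedom and F ratios for a balanced design.

    `cells` maps (factor_a_level, factor_b_level) to a list of scores.
    Every combination of levels must be present with the same number
    (at least two) of scores.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    sizes = {len(scores) for scores in cells.values()}
    if len(sizes) != 1 or len(cells) != len(a_levels) * len(b_levels):
        raise ValueError("this sketch needs a complete, balanced design")
    n = sizes.pop()
    if n < 2:
        raise ValueError("at least two scores per cell are needed")

    grand = mean(y for scores in cells.values() for y in scores)
    a_means = {a: mean(y for b in b_levels for y in cells[(a, b)]) for a in a_levels}
    b_means = {b: mean(y for a in a_levels for y in cells[(a, b)]) for b in b_levels}
    cell_means = {key: mean(scores) for key, scores in cells.items()}

    ss = {
        "A": n * len(b_levels) * sum((a_means[a] - grand) ** 2 for a in a_levels),
        "B": n * len(a_levels) * sum((b_means[b] - grand) ** 2 for b in b_levels),
        "A*B": n * sum(
            (cell_means[(a, b)] - a_means[a] - b_means[b] + grand) ** 2
            for a in a_levels
            for b in b_levels
        ),
        "Residuals": sum(
            (y - cell_means[key]) ** 2 for key, scores in cells.items() for y in scores
        ),
    }
    df = {"A": len(a_levels) - 1, "B": len(b_levels) - 1}
    df["A*B"] = df["A"] * df["B"]
    df["Residuals"] = len(cells) * (n - 1)

    # Each F ratio is the term's mean square divided by the residual mean square.
    ms_error = ss["Residuals"] / df["Residuals"]
    f_ratio = {term: (ss[term] / df[term]) / ms_error for term in ("A", "B", "A*B")}
    return ss, df, f_ratio


# Hypothetical toy data: 2 levels of A, 2 levels of B, 2 scores per cell.
toy_cells = {(1, 1): [1, 3], (1, 2): [3, 5], (2, 1): [5, 7], (2, 2): [7, 9]}
ss, df, f_ratio = two_way_anova(toy_cells)
print(ss, df, f_ratio)
```

In the toy data the cell means are perfectly additive, so the interaction sum of squares is zero. For the worksheet's data, A would be Nationality, B would be Age and the A*B row corresponds to the Nationality*Age row in the JASP table.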

Examining the Output

First, we undertake Test 1:

H0: There is no interaction between nationality and age

H1: There is an interaction between nationality and age

We examine the p-value in the row labelled Nationality*Age in the table, which assesses the interaction. In our case we have p=0.537. Since this is greater than 0.05, we conclude there is no evidence of an interaction between nationality and age. This means that any age effects observed can be assumed to be the same for all nationalities or, equivalently, that any differences due to nationality can be assumed to be the same in every age group.

Hence, we can proceed to undertake Tests 2 and 3:

Test 2 examines the nationality effect. To do this we examine the p-value in the row labelled Nationality in the table. Here we have p reported to be less than 0.001. Since this is less than 0.05, we conclude there is evidence of a nationality effect.

Test 3 examines the age effect. Examining the p-value in the row labelled Age we see this is also reported to be less than 0.001 and so we conclude there is evidence of an age effect.

Having established there are differences in maths marks due to both nationality and age, we need to undertake post hoc tests to understand where these differences are.

Post-hoc Tests

To obtain the post-hoc tests, scroll down to the Post Hoc Tests tab and move Nationality and Age to the right-hand side box. Make sure the Tukey option is ticked.

Open the Marginal Means tab. Move Nationality and Age to the right-hand side box.

Open the Descriptives Plots tab. Move Age to the Horizontal Axis box and Nationality to the Separate Lines box.

JASP should have now included the post-hoc test results, the plot and the descriptive statistics.

The post-hoc tests for Nationality are summarised in this table in the output:

Looking down the ptukey column, we can see there are some statistically significant differences, where p is less than 0.05. There is a statistically significant difference between the UK (coded 1) and the World (coded 3), p reported as less than 0.001, and between Europe (coded 2) and the World (coded 3), p=0.002. But there is no evidence of a statistically significant difference between the UK (coded 1) and Europe (coded 2), p=0.656. Hence the UK and Europe are similar, but both are different to the rest of the world.

For Age, the post-hoc tests are summarised in this table in the output:

Again, looking down the ptukey column, we can see there are some statistically significant differences. There is a statistically significant difference between younger (coded 1) and older people (coded 3), p reported as less than 0.001, and between middle-aged (coded 2) and older people (coded 3), p=0.042. But there is no evidence of a statistically significant difference between younger (coded 1) and middle-aged people (coded 2), p=0.246. Hence it seems younger and middle-aged people have similar ability in maths, but both differ from older people.

At the bottom of the output we have the means for each nationality and age group found under the Marginal Means section:

The output also includes the table below, which shows the mean maths marks for each nationality and age combination. For example, the mean mark for those in the UK (coded 1) that are in the younger age group (coded 1) was 50.167.

The above means are those that are plotted in the profile plot we asked for:

Reporting Results

We could write the results as:

“A two-way ANOVA was used to assess the effect of nationality and age on maths scores. There was no evidence of an interaction effect between nationality and age (p=0.537), but there was evidence of differences due to both nationality (p<0.001) and age (p<0.001). There was a statistically significant difference between the UK (M=57.89) and the World (M=72.89), p<0.001, and between Europe (M=60.83) and the World, p=0.002, but no evidence of a difference between the UK and Europe (p=0.656). There was also a statistically significant difference between younger people (M=57.44) and older people (M=71.28), p<0.001, and between middle-aged people (M=62.89) and older people, p=0.042, but no evidence of a difference between younger and middle-aged people (p=0.246).”
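Before reporting, a quick arithmetic check can catch transcription slips. With equal group sizes (18 per nationality and 18 per age group here), the nationality means and the age means must average to the same overall mean. The sketch below uses only the marginal means quoted above. The function name grand_mean is hypothetical.

```python
# Marginal means as quoted in the write-up above.
nationality_means = {"UK": 57.89, "Europe": 60.83, "Rest of World": 72.89}
age_means = {"Younger": 57.44, "Middle-aged": 62.89, "Older": 71.28}


def grand_mean(marginal_means):
    """Average of the marginal means (valid as the overall mean only when
    every group has the same number of observations)."""
    values = list(marginal_means.values())
    return sum(values) / len(values)


by_nationality = grand_mean(nationality_means)
by_age = grand_mean(age_means)
print(round(by_nationality, 2), round(by_age, 2))

# The two routes should agree up to the rounding of the reported means.
assert abs(by_nationality - by_age) < 0.01
```

If the two values disagreed by more than rounding error, one of the reported means would have been copied incorrectly (or the group sizes would not be equal).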

Further Work

Checking assumptions: Normality

For the test to be valid it should be reasonable to assume that the measurements in each age group and each nationality are approximately normally distributed. If we have 30 or more measurements in each nationality or each age group, then we can safely make that assumption, and we need not check this any further. Since we only had 18 measurements in each nationality group or 18 in each age group, we need to do some further assessments.

The assumption of normality is based on what are called the residuals. The residuals are the differences between each observed value and the mean value for that nationality/age sub-group. We therefore just need to check if this single set of residuals can be assumed to be normally distributed. Unfortunately, in the ANOVA option in JASP there is no option to save the residuals, but we can calculate them ourselves.

The screen below shows our data set after we have typed in the mean values in a column called Mean. The column called Residuals then contains the values of the Maths_score for each person minus their Mean value. Recall that earlier we saw that the mean for Nationality group 1, Age group 1 was 50.167, so the first residual was obtained by calculating the observed value of 49 minus the mean value of 50.167 which equals –1.167. You can do the calculations using JASP via the Transform menu and the Compute Variable option, or you can do the calculations using Excel and copy and paste the results into your JASP data set.
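The same calculation can be sketched outside JASP. The function below computes each residual as the observed score minus the mean of that person's nationality/age sub-group. The function name residuals and the toy rows are hypothetical (they are not the maths.csv data); only the final line reuses the worked figures from the paragraph above.

```python
from collections import defaultdict
from statistics import mean


def residuals(rows):
    """Return observed score minus sub-group mean, in the original row order.

    `rows` is a list of (nationality_code, age_code, score) tuples.
    """
    groups = defaultdict(list)
    for nationality, age, score in rows:
        groups[(nationality, age)].append(score)
    cell_means = {key: mean(scores) for key, scores in groups.items()}
    return [score - cell_means[(nationality, age)] for nationality, age, score in rows]


# Hypothetical toy rows: two sub-groups with two people each.
toy_rows = [(1, 1, 49), (1, 1, 51), (2, 1, 60), (2, 1, 64)]
print(residuals(toy_rows))

# The worked example from the text: observed 49, sub-group mean 50.167.
print(round(49 - 50.167, 3))
```

A useful sanity check is that the residuals within each sub-group sum to zero, up to rounding of the means.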

From the Descriptives menu, place Residuals in the Variables box:

Under the Basic Plots tab, tick Distribution plots.

Under the Statistics tab, tick Shapiro-Wilk test. Optionally, untick all other automatically ticked boxes.

Normality could be judged by examining the shape of the histogram to see if it makes a roughly symmetric bell-shaped curve.

We can also assess normality using the Shapiro-Wilk test, which appears in the JASP output under Descriptive Statistics. We need a non-significant result, i.e. the p-value needs to be greater than 0.05, to be able to assume normality. In our example, p = 0.861 and so we can assume normality.

Checking assumptions: Equal variances

The ANOVA test we undertook also assumes homogeneity (equality) of variance. Essentially, we need to be able to assume that the amount by which maths marks vary is similar within each age group, and likewise within each nationality group.

Return to the ANOVA analysis by selecting the ANOVA option at the top left of your analyses dialogue boxes, or by scrolling up your results to the ANOVA results. Under the Assumption Checks tab, tick Homogeneity tests:

Levene’s test is used in JASP to evaluate the homogeneity of variances assumption. In the output we see this table (you may need to scroll up or down in your results to see this):

We examine the p-value. You need the p-value to be greater than 0.05 to be able to assume homogeneity of variances. In our case p=0.620, so this assumption is reasonable.
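For readers curious about what the Levene statistic measures, here is a minimal sketch of the classic mean-centred form: it is a one-way ANOVA F ratio computed on the absolute deviations of each score from its own group mean. The worksheet does not say whether JASP centres on the mean or the median, so mean-centring is an assumption here. The function name levene_statistic and the toy groups are also hypothetical, and the p-value is omitted because it needs the F distribution.

```python
from statistics import mean


def levene_statistic(groups):
    """Return (W, df1, df2) for the mean-centred Levene test.

    `groups` is a list of lists of scores, one inner list per group
    (for this worksheet, one per nationality/age combination).
    """
    k = len(groups)
    n_total = sum(len(group) for group in groups)

    # Absolute deviation of every score from its own group mean.
    deviations = []
    for group in groups:
        centre = mean(group)
        deviations.append([abs(y - centre) for y in group])

    group_means = [mean(z) for z in deviations]
    overall_mean = mean(value for z in deviations for value in z)

    between = sum(len(z) * (m - overall_mean) ** 2 for z, m in zip(deviations, group_means))
    within = sum((value - m) ** 2 for z, m in zip(deviations, group_means) for value in z)
    if within == 0:
        raise ValueError("no variation in the deviations; the statistic is undefined")

    w = (n_total - k) / (k - 1) * between / within
    return w, k - 1, n_total - k


# Hypothetical toy groups with visibly different spreads.
print(levene_statistic([[1, 2, 3], [2, 4, 6]]))
```

A large W relative to the F distribution with (df1, df2) degrees of freedom gives a small p-value, which would count against the equal-variance assumption.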

For more resources, see sigma.coventry.ac.uk. Adapted from material developed by Coventry University. Creative Commons License.