Two-Way analysis of variance (ANOVA) using SPSS

When to use a two-way ANOVA

A two-way analysis of variance (ANOVA) is a statistical test that can be used to assess whether the mean value of some outcome depends on two other factors.

Example: we might want to assess if weight depends on nationality and gender.

Here weight is our Dependent variable.
Nationality and gender are our two Factors (Independent variables) that we think weight may depend on.

We might also be interested in whether the way that gender affects weight is the same across all nationalities.

The observations/measurements on the Dependent variable, such as weight, must be Scale data.

For more help on “What test do I need” go to the sigma website statistical worksheets resources page.

Example

An educational researcher collected data on maths scores from adults of different ages living in different countries. They were grouped into nationality according to whether they were from the UK, Europe or the rest of the World. Their age was classified according to whether they fell into three pre-defined age groups: Younger, Middle-aged or Older. The data are shown below and can be downloaded in an SPSS file called maths.sav, where Nationality was coded 1=UK, 2=Europe and 3=Rest of World, and Age was coded as 1=Younger, 2=Middle-aged and 3=Older.

A two-way Analysis of Variance (ANOVA) can be used here to test three different sets of hypotheses:

Test 1:

H0: There is no interaction between nationality and age

H1: There is an interaction between nationality and age

The presence of an interaction between nationality and age means that any age effects observed are not the same for all nationalities or, equivalently, that any differences due to nationality are not the same in every age group.

Test 2:

H0: There is no difference in mean maths score between the three nationality groups

H1: There is a difference in mean maths score between the three nationality groups

Test 3:

H0: There is no difference in mean maths score between the three age groups

H1: There is a difference in mean maths score between the three age groups

Note that in Tests 2 and 3 we assume that any age effects observed are the same for all nationalities or, equivalently, that any differences due to nationality are the same in every age group.

Hence, we should do Test 1 first: if we conclude there is an interaction between nationality and age, then Tests 2 and 3 are no longer appropriate. In that case we should skip Tests 2 and 3 and instead look at the mean maths score for each nationality/age combination and report those means (we will see how to plot these means in a profile plot).

If we conclude there is no interaction between nationality and age, then we can do Tests 2 and 3.

Using SPSS

From the Main Menu, click Analyze then General Linear Model then Univariate:

In the next dialogue window, move Maths_score into the Dependent Variable box, and move Nationality and Age to the Fixed Factor(s) box.

Click on OK to see the results.

Examining the Output

This table is the one we are initially most interested in:

First, we undertake Test 1:

H0: There is no interaction between nationality and age

H1: There is an interaction between nationality and age

We examine the Sig value (p-value) in the row labelled Nationality*Age in the table, which assesses the interaction. In our case we have p=0.537. Since this is greater than 0.05, we conclude there is no evidence of an interaction between nationality and age. This means that any age effects observed can be assumed to be the same for all nationalities or, equivalently, that any differences due to nationality can be assumed to be the same in every age group.

Hence, we can proceed to undertake Tests 2 and 3:

Test 2 examines the nationality effect. To do this we examine the Sig value (p-value) in the row labelled Nationality in the table. Here we have p reported to be less than 0.001. Since this is less than 0.05, we conclude there is evidence of a nationality effect.

Test 3 examines the age effect. Examining the Sig value (p-value) in the row labelled Age we see this is also reported to be less than 0.001 and so we conclude there is evidence of an age effect.

Having established there are differences in maths marks due to both nationality and age, we need to undertake post hoc tests to understand where these differences are.
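To see where the F values behind the Sig column come from, here is a minimal Python sketch of the two-way ANOVA calculations for a balanced design (the same number of people in every nationality/age sub-group). The scores, the `cells` dictionary and the `two_way_anova` function are all made up for illustration: they are not the maths.sav data, and this is not a description of how SPSS itself is implemented. Turning each F value into a p-value also needs the F distribution, which is left out here.

```python
from statistics import mean

# Hypothetical scores (NOT the maths.sav data): 2 nationalities x 2 age groups,
# with 3 people in every sub-group (a balanced design).
cells = {
    ("UK", "Younger"): [50, 54, 52],
    ("UK", "Older"): [60, 64, 62],
    ("World", "Younger"): [66, 70, 68],
    ("World", "Older"): [78, 82, 80],
}


def two_way_anova(cells):
    """Return {row: (sum of squares, df, F)} for a balanced two-way design."""
    nats = sorted({nat for nat, _ in cells})
    ages = sorted({age for _, age in cells})
    n = len(next(iter(cells.values())))  # people per sub-group

    grand = mean([x for xs in cells.values() for x in xs])
    cell_mean = {key: mean(xs) for key, xs in cells.items()}
    # In a balanced design each marginal mean is the mean of its cell means.
    nat_mean = {a: mean([cell_mean[a, b] for b in ages]) for a in nats}
    age_mean = {b: mean([cell_mean[a, b] for a in nats]) for b in ages}

    ss_nat = len(ages) * n * sum((nat_mean[a] - grand) ** 2 for a in nats)
    ss_age = len(nats) * n * sum((age_mean[b] - grand) ** 2 for b in ages)
    ss_int = n * sum(
        (cell_mean[a, b] - nat_mean[a] - age_mean[b] + grand) ** 2
        for a in nats for b in ages
    )
    ss_err = sum((x - cell_mean[key]) ** 2 for key, xs in cells.items() for x in xs)

    df_nat, df_age = len(nats) - 1, len(ages) - 1
    df_int = df_nat * df_age
    df_err = len(nats) * len(ages) * (n - 1)
    ms_err = ss_err / df_err  # error mean square: the denominator of every F

    return {
        "Nationality": (ss_nat, df_nat, ss_nat / df_nat / ms_err),
        "Age": (ss_age, df_age, ss_age / df_age / ms_err),
        "Nationality*Age": (ss_int, df_int, ss_int / df_int / ms_err),
        "Error": (ss_err, df_err, None),
    }


for row, (ss, df, f) in two_way_anova(cells).items():
    print(row, ss, df, f)
```

For these made-up scores the interaction row has a small F value (0.75) while the Nationality and Age rows have large ones. That is the same pattern as in our SPSS output, where the interaction was not significant but both main effects were.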

Post-hoc Tests

To obtain the post hoc tests, from the Main Menu click Analyze then General Linear Model then Univariate, and at the main ANOVA Dialog box, click the Post Hoc button:

In the next dialogue box, move Nationality and Age to the Post Hoc Tests for box, and select the Tukey option:

Next click Continue to return to the main dialogue box.

Back on the main ANOVA dialogue box, click on the Plots button:

In the new window, move Age to the Horizontal Axis box and Nationality to the Separate Lines box and click on Add:

Make sure you have clicked on Add and then click on Continue to return to the main dialogue box.

Back on the main ANOVA dialogue box, click the Options button to obtain the means by age group and nationality:

In the new dialogue window, select Descriptive Statistics:

Then click on Continue to return to the main dialogue box and then click on OK to obtain the results.

SPSS will have performed the ANOVA a second time, but this time the output will also include the post-hoc test results, the plot and the descriptive statistics.

The post-hoc tests for Nationality are summarised in this table in the output:

Looking down the Sig column, we can see there are some statistically significant differences, where p is less than 0.05. There is a statistically significant difference between the UK and the World, p reported as less than 0.001, and between Europe and the World, p=0.002. But there is no evidence of a statistically significant difference between the UK and Europe, p=0.656. Hence the UK and Europe are similar, but both are different to the rest of the world.

For Age, the post-hoc tests are summarised in this table in the output:

Looking down the Sig column, we can again see some statistically significant differences. There is a statistically significant difference between younger and older people, p reported as less than 0.001, and between middle-aged and older people, p=0.042. But there is no evidence of a statistically significant difference between younger and middle-aged people, p=0.246. Hence it seems younger and middle-aged people have similar ability in maths, but both are different to older people.

The table below shows the mean maths marks for each nationality and age group.

The above means are those that are plotted in the profile plot we asked for:
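The means behind a profile plot are simply the average score in each nationality/age sub-group. A minimal Python sketch of that calculation is given here; the records and the `cell_means` function are hypothetical and are not taken from maths.sav.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (nationality, age group, maths score);
# these are NOT the values in maths.sav.
records = [
    ("UK", "Younger", 50), ("UK", "Younger", 54),
    ("UK", "Older", 60), ("UK", "Older", 64),
    ("World", "Younger", 66), ("World", "Younger", 70),
    ("World", "Older", 78), ("World", "Older", 82),
]


def cell_means(records):
    """Mean score for every nationality/age combination."""
    groups = defaultdict(list)
    for nationality, age, score in records:
        groups[(nationality, age)].append(score)
    return {key: mean(scores) for key, scores in groups.items()}


means = cell_means(records)
ages = ["Younger", "Older"]          # horizontal axis of the profile plot
for nationality in ["UK", "World"]:  # one separate line per nationality
    print(nationality, [means[(nationality, age)] for age in ages])
```

Each printed row corresponds to one line on the plot. Lines that rise or fall by roughly the same amount are roughly parallel, which is what we expect to see when there is no interaction.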

Reporting Results

We could write the results as:

“A two-way ANOVA was used to assess the effect of nationality and age on maths scores. There was no evidence of an interaction effect between nationality and age (p=0.537), but there was evidence of differences due to both nationality (p<0.001) and age (p<0.001). There was a statistically significant difference between the UK (M=57.89, SD=9.634) and the World (M=72.89, SD=11.626), p<0.001, and between Europe (M=60.83, SD=12.692) and the World, p=0.002, but no evidence of a difference between the UK and Europe (p=0.656). There was also a statistically significant difference between younger people (M=57.44, SD=11.947) and older people (M=71.28, SD=13.446), p<0.001, and between middle-aged people (M=62.89, SD=9.821) and older people, p=0.042, but no evidence of a statistically significant difference between younger and middle-aged people (p=0.246).”
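The sizes of the differences behind these p-values follow directly from the nationality means quoted in the report. A small Python sketch of that arithmetic is given here; the `pairwise_differences` function is a hypothetical helper, and it only subtracts means (it does not carry out the Tukey test itself).

```python
from itertools import combinations

# Nationality means as quoted in the report above.
nationality_means = {"UK": 57.89, "Europe": 60.83, "World": 72.89}


def pairwise_differences(means):
    """Difference in means (second group minus first) for every pair."""
    return {
        (a, b): round(means[b] - means[a], 2)
        for a, b in combinations(means, 2)
    }


for pair, diff in pairwise_differences(nationality_means).items():
    print(pair, diff)
```

The two pairs involving the World group differ by about 15.00 and 12.06 marks and were the ones found to be statistically significant, whereas the UK and Europe differ by only about 2.94 marks.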

Further Work

Checking Assumptions: Normality

For the test to be valid it should be reasonable to assume that the measurements in each age group and each nationality are approximately normally distributed. If we have 30 or more measurements in each nationality and each age group, then we can safely make that assumption, and we need not check this any further. Since we only had 18 measurements in each nationality group and 18 in each age group, we need to do some further assessments.

The assumption of normality is based on what are called the residuals. The residuals are the differences between each observed value and the mean value for that nationality/age sub-group. We therefore just need to check if this single set of residuals can be assumed to be normally distributed.
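As a minimal sketch of this definition, the Python below subtracts the sub-group mean from every observation. The scores and the `residuals` function are made up for illustration and are not the maths.sav data.

```python
from statistics import mean

# Hypothetical scores per nationality/age sub-group (NOT the maths.sav data).
cells = {
    ("UK", "Younger"): [50, 54, 52],
    ("UK", "Older"): [60, 64, 62],
    ("World", "Younger"): [66, 70, 68],
    ("World", "Older"): [78, 82, 80],
}


def residuals(cells):
    """Observed value minus the mean of its own nationality/age sub-group."""
    out = []
    for scores in cells.values():
        sub_group_mean = mean(scores)
        out.extend(x - sub_group_mean for x in scores)
    return out


res = residuals(cells)
print(res)
```

All the residuals are pooled into a single list, which is why only one normality check is needed rather than one per sub-group. Within each sub-group the residuals always add up to zero.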

From the Main Menu click Analyze then General Linear Model then Univariate, and at the main ANOVA dialogue box, click on the Save button:

In the next dialogue window, in the Residuals section, click the Unstandardized option and click Continue:

Back on the main dialogue box, click OK to see the results. SPSS will perform the ANOVA again for you, but this time it won’t include any new output. All it will have done is add a new column to your data set, as shown below. The new column containing the residuals is called RES_1.

From the SPSS main menu click Analyze then Descriptive Statistics then Explore:

In the dialogue box that opens, place the Residual for Maths variable in the Dependent List box:

Click the Plots button, which opens the Explore: Plots dialogue box. In the new dialogue box, untick Stem-and-leaf and select Histogram. Then tick the checkbox for Normality plots with tests. Click Continue then OK.

Normality could be judged by examining the shape of the histogram to see if it makes a roughly symmetric bell-shaped curve.

We can also assess normality using the Shapiro-Wilk test, which appears in the SPSS output immediately above the histogram. We need a non-significant result, i.e. the sig value (p-value) needs to be greater than 0.05 to be able to assume normality. In our example, p = 0.861 and we can assume normality.

Checking Assumptions: Equality of Variances

The ANOVA test we undertook also assumes homogeneity (equality) of variance. Essentially, this asks whether we can assume that the amount by which maths marks vary is similar within each age group, and likewise within each nationality group.

From the Main Menu click Analyze then General Linear Model then Univariate, and at the main ANOVA dialogue box, click on the Options button:

In the next dialogue window, click the Homogeneity tests option:

Click Continue then OK.

Levene’s test is used in SPSS to evaluate the homogeneity of variance assumption. In the output we see this table:

We examine the sig value (p-value) in the first row (based on means). The p-value needs to be greater than 0.05 for us to be able to assume homogeneity of variances. In our case p=0.620, so this assumption is reasonable.
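The idea behind Levene's test based on means can be sketched in Python: take the absolute distance of every score from its own group mean, then carry out a one-way ANOVA on those distances. The three groups, their scores and the `levene_based_on_mean` function are hypothetical and are not the maths.sav data. The sketch returns only the test statistic and its degrees of freedom; the p-value would additionally need the F distribution.

```python
from statistics import mean

# Hypothetical scores for three groups (NOT the maths.sav data).
groups = {
    "group 1": [50, 52, 54],
    "group 2": [56, 62, 68],
    "group 3": [66, 70, 74],
}


def levene_based_on_mean(groups):
    """Return (W, df1, df2) for Levene's test based on the group means."""
    # Step 1: absolute deviation of each score from its own group mean.
    z = {}
    for name, scores in groups.items():
        group_mean = mean(scores)
        z[name] = [abs(x - group_mean) for x in scores]
    z_bar = {name: mean(devs) for name, devs in z.items()}
    z_all = mean([v for devs in z.values() for v in devs])

    k = len(groups)                                    # number of groups
    n_total = sum(len(scores) for scores in groups.values())

    # Step 2: one-way ANOVA F statistic on those absolute deviations.
    between = sum(len(z[g]) * (z_bar[g] - z_all) ** 2 for g in z)
    within = sum((v - z_bar[g]) ** 2 for g in z for v in z[g])
    w = (between / (k - 1)) / (within / (n_total - k))
    return w, k - 1, n_total - k


w, df1, df2 = levene_based_on_mean(groups)
print(round(w, 3), df1, df2)
```

A small value of W, and hence a large p-value, means the spread of scores is similar across the groups, which is what the assumption requires.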

For more resources, see sigma.coventry.ac.uk

Adapted from material developed by Coventry University

Creative Commons License