Meta-Analysis of Continuous Outcomes Using R (Using RStudio)

This worksheet focuses on meta-analysis of continuous data, that is, outcomes measured on a scale, such as weight or calories, or any outcome reported using means and standard deviations.

Example Data

We use the results of four studies (research papers) that examined a method called “SMI” (Suboccipital Muscle Inhibition) versus “Other” methods to improve the flexibility of the knee joint in adults. The outcome of interest was the increase in the angle the knee can be moved (Popliteal Knee Angle) in degrees, after using either “SMI” or an “Other” technique. The data are therefore measured in degrees (for angles) and so are continuous. The studies looked at the change in knee angle, but note that this is calculated as pre-treatment minus post-treatment (not post minus pre): a reduction is a good outcome, so a reduction is recorded as a positive change.

The data below show the mean change (pre minus post) for each group, together with the standard deviation of the change, which we need for the analysis, and the sample size in each group. The data were extracted from the information given in the papers. Note that some of the data are missing.

		Experimental Group			Control Group
Study ID	Mean Change	SD	n	Mean Change	SD	n
Kuan 2019	7.41	7.12	27	5.37	6.78	27
Aparicio 2009	4.14		34	0.85		34
Cho 2015	5.5	6.6	25	2.3	5.0	25
Joshi 2018	9.5		20	9

For help on how to extract the above data from the published papers, please see the accompanying resource that looks at extracting data for meta-analyses.

It was not possible to obtain the standard deviation for the change in knee angle for two of the studies due to a lack of information in the papers. There are various options open to you to “estimate” these, which are discussed in Chapter 10 of the Cochrane Handbook. However, for simplicity we will estimate the missing standard deviations using the information we do have. For example, the estimate for the missing SD for the SMI (Experimental) group can be based on the mean of the two values we do have from the other two studies, 7.12 and 6.6. This gives (7.12+6.6)/2 = 6.86. There are perhaps better ways of doing this, but the aim here is to keep things simple. Hence for Aparicio (2009) and Joshi (2018) we use 6.86 as the estimated SD for the Experimental group.

Similarly for the Control group we estimate the missing SDs as (6.78+5.0)/2 = 5.89. Hence for Aparicio (2009) and Joshi (2018) we use 5.89 as the estimated SD for the Control group.
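The two averages above can be checked in R. This is only a minimal sketch of the simple approach described here; the helper name fill_missing_sd is hypothetical and not part of any package.

```r
# Reported SDs of the change scores, copied from the table above;
# NA marks the two studies where the SD could not be obtained.
sd_experimental <- c(Kuan = 7.12, Aparicio = NA, Cho = 6.6, Joshi = NA)
sd_control      <- c(Kuan = 6.78, Aparicio = NA, Cho = 5.0, Joshi = NA)

# Hypothetical helper: replace each missing SD with the mean of the reported SDs
fill_missing_sd <- function(sds) {
  sds[is.na(sds)] <- mean(sds, na.rm = TRUE)
  sds
}

sd_experimental_filled <- fill_missing_sd(sd_experimental)
sd_control_filled      <- fill_missing_sd(sd_control)

sd_experimental_filled  # missing entries become (7.12 + 6.6) / 2 = 6.86
sd_control_filled       # missing entries become (6.78 + 5.0) / 2 = 5.89
```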

Entering the data into RStudio

The data can be found in the csv file called Knee.csv. It contains the data shown in the previous table, but also includes the following extra variables that have been calculated from that data: Mean Difference, Pooled Standard Deviation, Standard Error of Mean Difference, Cohen’s d and Standard Error of Cohen’s d.

Note: When undertaking a meta-analysis on your own data, there is an equivalent Excel spreadsheet called Meta-analysis_continuous_data.xlsx which automatically calculates the extra variables listed above once the mean, standard deviation and n for your two groups are entered into it (you do not need to calculate these yourself). You can then save that data as a csv file and import it into RStudio to obtain a data set like the one described above.
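For readers curious about how these extra variables are obtained, the sketch below computes them in R for one study. The formulas are the usual textbook ones for two independent groups and are an assumption about what the spreadsheet does, since its formulas are not shown here; the function name effect_size is hypothetical. For Kuan (2019) the result rounds to d = 0.29, which agrees with the value discussed later in this worksheet.

```r
# Hypothetical helper: derived variables for one study from the group summaries.
# Standard formulas for two independent groups (assumed, not taken from the spreadsheet).
effect_size <- function(m1, sd1, n1, m2, sd2, n2) {
  mean_diff    <- m1 - m2
  pooled_sd    <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  se_mean_diff <- pooled_sd * sqrt(1 / n1 + 1 / n2)
  cohens_d     <- mean_diff / pooled_sd
  se_cohens_d  <- sqrt((n1 + n2) / (n1 * n2) + cohens_d^2 / (2 * (n1 + n2)))
  c(mean_diff = mean_diff, pooled_sd = pooled_sd, se_mean_diff = se_mean_diff,
    cohens_d = cohens_d, se_cohens_d = se_cohens_d)
}

# Kuan 2019: experimental group first, control group second (values from the table)
kuan <- effect_size(m1 = 7.41, sd1 = 7.12, n1 = 27,
                    m2 = 5.37, sd2 = 6.78, n2 = 27)
round(kuan, 2)
```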

Importing the data into R

To get started with the analysis, first bring the dataset into RStudio. You can either use the read.csv() function, if you know how to do this, or follow the steps below using the menus:

From the File menu select Import Dataset, then From Text (base):

From the pop-up window navigate to the folder where you have saved the dataset, then once the file is selected click “Open”:

At the next dialogue box (see below), amend the name of your dataset in the “Name” field in the upper left corner if you wish. In this example we named it “knee”.

We should also make sure the Heading option below is set to Yes (otherwise the data will all be read in as text):

Finally, click on “Import” to complete the process. This imports the data set, which is then listed in the “Environment” pane in the top right of your RStudio screen as follows:
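If you prefer the read.csv() route mentioned earlier, a minimal sketch is given below. It assumes that Knee.csv has been saved in your current working directory; the csv_path variable is only illustrative, so adjust the path to wherever you saved the file.

```r
# Assumed location of the file: the current working directory (see getwd())
csv_path <- "Knee.csv"

if (file.exists(csv_path)) {
  # header = TRUE treats the first row as column names,
  # like setting the Heading option to Yes in the menu-based import
  knee <- read.csv(csv_path, header = TRUE)
  str(knee)   # check the column names and types that were read in
} else {
  message("Could not find ", csv_path, " in ", getwd())
}
```

Calling str(knee) is a quick way to confirm that the numeric columns were read as numbers rather than text.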

Running the Meta-Analysis

To run the meta-analysis, we will use the function metagen() from the package meta. First install the package, then load it with the library() function.

install.packages("meta", repos="https://cloud.r-project.org")  
library(meta)

Once the package is installed and loaded, we can use the metagen() function. We need to supply the Cohen’s d, SE of Cohen’s d and Author columns, the data set that contains them, and the summary measure (sm), which in this case is the Standardised Mean Difference ("SMD").

meta_results <- metagen(TE = Cohens.d,
                      seTE = SE.of.Cohens.d,
                      studlab = Author,
                      data = knee,
                      sm = "SMD")

We also want to produce a forest plot of the results, which we can obtain using the forest() function:

forest(meta_results)

Understanding the Forest Plot

The numbers on the right-hand side include the estimated effect size (Cohen’s d) from each study. For example, Kuan has a Cohen’s d value of 0.29. Using the following benchmarks for Cohen’s d (see Cohen, 1988): small (d = 0.2), medium (d = 0.5), and large (d = 0.8), a value of 0.29 is a small effect. The other numbers on the right-hand side give a 95% confidence interval for the true effect size. Kuan has a lower value of -0.24 and an upper value of +0.83; this interval includes zero, suggesting the true effect could be positive or negative. Hence, this study on its own provides no evidence in favour of SMI over other methods. These results are visualised in the Forest Plot on the right-hand side. The square boxes indicate the effect size (Cohen’s d) reported in each study and the horizontal lines show the confidence intervals. The top box shows the Cohen’s d value of 0.29 for Kuan and its line spans the confidence interval from -0.24 to +0.83.
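The confidence interval quoted for Kuan can be reproduced by hand as d plus or minus 1.96 standard errors. The sketch below rebuilds d and its standard error from the table values using standard formulas for equal group sizes; these formulas are an assumption about how the values in Knee.csv were calculated.

```r
# Kuan 2019, values from the data table (27 participants in each group)
n         <- 27
pooled_sd <- sqrt((7.12^2 + 6.78^2) / 2)   # equal n, so average the two variances
d         <- (7.41 - 5.37) / pooled_sd     # Cohen's d
se_d      <- sqrt(2 / n + d^2 / (4 * n))   # standard error of d for equal group sizes

# 95% confidence interval: estimate -/+ 1.96 standard errors
ci <- d + c(-1, 1) * qnorm(0.975) * se_d

round(d, 2)    # 0.29
round(ci, 2)   # -0.24  0.83
```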

The vertical line at zero indicates the location of the null effect (i.e. no difference). The diamond at the bottom indicates the overall effect from the meta-analysis, and its width shows the 95% confidence interval for that overall effect. A larger box indicates that the meta-analysis gave that study a larger weight, so it made a larger contribution to the overall result. Studies with greater weight have lower variation (i.e. greater precision) and a narrower confidence interval. These will often, but not always, be the larger studies.
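To illustrate how weight relates to precision, the sketch below computes inverse-variance weights for the two studies with complete data (Kuan and Cho). It is only an illustration: the standard errors are rebuilt from the table with standard formulas (an assumption), the helper se_of_d is hypothetical, the between-study variance tau2 is assumed to be zero, and because only two of the four studies are used the percentages will not match those on the forest plot.

```r
# Hypothetical helper: standard error of Cohen's d for two groups of equal size n
se_of_d <- function(m1, sd1, m2, sd2, n) {
  d <- (m1 - m2) / sqrt((sd1^2 + sd2^2) / 2)
  sqrt(2 / n + d^2 / (4 * n))
}

se <- c(Kuan = se_of_d(7.41, 7.12, 5.37, 6.78, n = 27),
        Cho  = se_of_d(5.5, 6.6, 2.3, 5.0, n = 25))

tau2 <- 0                        # assumed between-study variance
weights <- 1 / (se^2 + tau2)     # inverse-variance weights

# Relative weights (%) among these two studies only
round(100 * weights / sum(weights), 1)
```

Kuan has the smaller standard error of the two, so it receives the larger share of the weight.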

Reporting Results

All four studies are consistent with a greater increase in knee angle with SMI compared to other methods, since the estimated effect sizes (boxes) are all above zero. However, only Aparicio has a 95% confidence interval that excludes zero and hence shows evidence of an effect. Kuan and Joshi both have 95% confidence intervals that clearly include zero and hence do not show evidence of an effect. Cho has a 95% confidence interval that only just includes zero and hence falls just short of showing evidence of an effect. The reported differences were therefore statistically significant only in Aparicio, narrowly non-significant in Cho, and non-significant in the other two studies.

We can also obtain values for the overall effect and p-value. We do this using the results from the previous meta analysis. First, we summarise the results using the summary() function. Next, we extract the overall effect size and the p-value using the dollar sign $.

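# Store the summary under its own name so that it does not mask the summary() function
meta_summary <- summary(meta_results)
overall_effect_size <- meta_summary$TE.random
overall_effect_size

## [1] 0.3824351

p_value <- meta_summary$pval.random
p_value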

## [1] 0.00601261

The overall effect (Cohen’s d) is d = 0.38 and the p-value for the overall effect is p = 0.006. The meta-analysis therefore provides evidence of a statistically significant, small to medium effect (d = 0.38, p = 0.006), indicating an improvement in knee angle with SMI compared to other methods. It is worth noting that the 95% Confidence Interval for the overall effect is quite wide, with a lower value of 0.11 (see Forest Plot above), so the true effect could plausibly be small.
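As a rough check on the confidence interval, its limits can be recovered from the two numbers printed above. The sketch assumes the p-value comes from a two-sided z-test, which is the usual default for metagen(); if a different method had been requested for the random effects interval, the result would differ slightly.

```r
estimate <- 0.3824351    # overall effect (TE.random) reported above
p_value  <- 0.00601261   # p-value (pval.random) reported above

# Assumes a two-sided z-test, so that p = 2 * (1 - pnorm(abs(z)))
z  <- qnorm(1 - p_value / 2)
se <- estimate / z                               # implied standard error
ci <- estimate + c(-1, 1) * qnorm(0.975) * se    # 95% confidence interval

round(ci, 2)   # the lower limit rounds to 0.11, as read from the forest plot
```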

Checking Heterogeneity

The Cochrane Handbook (section 9.5) suggests the following interpretation of the $I^2$ values.

I²	Interpretation
0% to 40%	Might not be important
30% to 60%	May represent moderate heterogeneity
50% to 90%	May represent substantial heterogeneity
75% to 100%	Considerable heterogeneity

From the forest plot we can see that $I^2$ = 0.00 or 0%, so there is little or no heterogeneity evident. This suggests the studies are estimating a similar effect, which supports the reliability of our pooled result.
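For reference, $I^2$ is usually derived from Cochran's Q statistic and its degrees of freedom (number of studies minus one), with negative values set to zero. The sketch below shows that calculation; the function name i_squared is hypothetical and the Q values are made up purely for illustration, they are not taken from the knee data.

```r
# Hypothetical helper: I^2 as a percentage from Cochran's Q and its degrees of freedom
i_squared <- function(q, df) {
  max(0, (q - df) / q) * 100
}

# Illustrative Q values only; df = 3 corresponds to four studies
i_squared(q = 2.1, df = 3)   # Q is below df, so I^2 is truncated at 0
i_squared(q = 6.0, df = 3)   # (6 - 3) / 6 = 50%
```

With a fitted model, the value reported on the forest plot should also be available from the results object (for example as meta_results$I2), although the exact element name may depend on the version of the meta package.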

References

Borenstein, M., Hedges, L., Higgins, J. and Rothstein, H. (2009). Introduction to Meta-Analysis. John Wiley & Sons.

Tufanaru, C., Munn, Z., Stephenson, M. and Aromataris, E. (2015). Fixed or random effects meta-analysis? Common methodological issues in systematic reviews of effectiveness. International Journal of Evidence-Based Healthcare, 13(3), 196-207. DOI: 10.1097/XEB.0000000000000065

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

For more resources, see sigma.coventry.ac.uk. Adapted from material developed by Coventry University. Creative Commons License.