Meta-Analysis of Categorical Outcomes Using R (Using RStudio)

This worksheet focuses on a meta-analysis of categorical outcomes. These include binary outcomes such as increased or not, success or failure, or the presence or absence of something.

Example Data

Our example uses the results of three studies that examined (amongst other things) the presence/absence of the TT genotype* in two groups of participants. The treatment or experimental group of interest was those with hypertension (raised blood pressure) and the control group was those with no history of hypertension. We want to know if the TT genotype is somehow linked with having hypertension.

* Imagine your friend is looking at a book and you ask them if the second word in the first sentence is the word “was”. We are asking if at a specific point in the book the word is “was” (yes or no). We are sort of asking the same thing here but this time in relation to the genome and asking if at a particular point in the genome the genotype is “TT” (yes or no).

The data are shown in the table below. In the first study, by Say et al. (2005), the number of patients in the Hypertension (Treatment) group with the TT genotype was 22 and the number without was 79. In the Control group (No Hypertension), the number with the TT genotype was 10 and the number without was 77.

Study ID	Hypertension Group TT	Hypertension Group Non TT	Control Group TT	Control Group Non TT
Say et al 2005	22	79	10	77
Rodriguez-Perez et al 2001	87	212	60	255
Cheng et al 2012	165	135	69	81

For help on how to extract the above data please see the resource that looks at extracting data for meta-analyses.

Entering the data into RStudio

We can manually enter the data in RStudio by creating a data frame:

data <- data.frame(
  Study_ID = c("Say et al 2005", "Rodriguez-Perez et al 2001", "Cheng et al 2012"),
  Hypertension_TT = c(22, 87, 165),
  Hypertension_Non_TT = c(79, 212, 135),
  Control_TT = c(10, 60, 69),
  Control_Non_TT = c(77, 255, 81)
)
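Before running the analysis it can help to confirm the group totals that will later be passed to the meta-analysis function. The following is a minimal sketch in base R. The column names Hypertension_N, Control_N, Hypertension_Prop_TT and Control_Prop_TT are illustrative names introduced here and are not part of the original worksheet.

```r
# Recreate the worksheet's data frame so this check runs on its own
data <- data.frame(
  Study_ID = c("Say et al 2005", "Rodriguez-Perez et al 2001", "Cheng et al 2012"),
  Hypertension_TT = c(22, 87, 165),
  Hypertension_Non_TT = c(79, 212, 135),
  Control_TT = c(10, 60, 69),
  Control_Non_TT = c(77, 255, 81)
)

# Group sizes (illustrative column names): TT plus non-TT in each group
data$Hypertension_N <- data$Hypertension_TT + data$Hypertension_Non_TT
data$Control_N <- data$Control_TT + data$Control_Non_TT

# Proportion of each group carrying the TT genotype
data$Hypertension_Prop_TT <- round(data$Hypertension_TT / data$Hypertension_N, 3)
data$Control_Prop_TT <- round(data$Control_TT / data$Control_N, 3)

print(data[, c("Study_ID", "Hypertension_N", "Control_N",
               "Hypertension_Prop_TT", "Control_Prop_TT")])
```

In every study the proportion with the TT genotype is higher in the Hypertension group, which is the pattern the meta-analysis goes on to quantify.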

Running the Meta Analysis

First, install the meta package using the install.packages() function (this only needs to be done once). This adds functions specific to meta-analysis that are not available in the base installation of R. Then, load the package using the library() function.

install.packages("meta", repos="https://cloud.r-project.org")  
library(meta)

We can then use the metabin() function, supplying our TT counts and group totals, the study labels and the data frame. We also need to specify the summary measure, which in this case is the odds ratio (sm = "OR"). The results will be stored as meta_results.

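# Pool the three studies using the odds ratio as the summary measure
meta_results <- metabin(
  event.e = Hypertension_TT,                    # TT count, hypertension group
  n.e = Hypertension_TT + Hypertension_Non_TT,  # total, hypertension group
  event.c = Control_TT,                         # TT count, control group
  n.c = Control_TT + Control_Non_TT,            # total, control group
  studlab = Study_ID,                           # study labels
  data = data,
  sm = "OR")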

We also want to produce a forest plot of our results. We can do this using the forest() function.

forest(meta_results)

Understanding the Forest Plot

The numbers on the right-hand side include the estimated effect size using the Odds Ratio (OR) from each study. For example, Say et al. has an OR of 2.14. The column labelled ‘95%-CI’ then gives a 95% confidence interval for the true Odds Ratio. Say et al. has a lower value of 0.95 and an upper value of 4.82, suggesting the true Odds Ratio could be greater than 1 or less than 1. These results are visualised in the plot area of the Forest Plot. The square boxes indicate the Odds Ratio estimated from each study and the horizontal lines show the confidence intervals. The top box shows the OR of 2.14 for Say et al. and the line either side of that top box indicates the confidence interval going from 0.95 to 4.82.
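The study-level odds ratios and confidence intervals can be reproduced by hand. The sketch below assumes the usual log odds ratio (Wald) interval: the OR is (TT hypertension × non-TT control) / (non-TT hypertension × TT control), and the standard error of log(OR) is the square root of the sum of the reciprocals of the four cell counts. The variable names are illustrative.

```r
# Cell counts taken from the worksheet's table
study    <- c("Say et al 2005", "Rodriguez-Perez et al 2001", "Cheng et al 2012")
hyp_tt   <- c(22, 87, 165)   # hypertension group, TT
hyp_non  <- c(79, 212, 135)  # hypertension group, non-TT
ctl_tt   <- c(10, 60, 69)    # control group, TT
ctl_non  <- c(77, 255, 81)   # control group, non-TT

# Odds ratio for each study
or <- (hyp_tt * ctl_non) / (hyp_non * ctl_tt)

# Standard error of log(OR), then a 95% interval built on the log scale
se_log_or <- sqrt(1 / hyp_tt + 1 / hyp_non + 1 / ctl_tt + 1 / ctl_non)
z_crit    <- qnorm(0.975)
ci_lower  <- exp(log(or) - z_crit * se_log_or)
ci_upper  <- exp(log(or) + z_crit * se_log_or)

by_hand <- data.frame(study,
                      OR    = round(or, 2),
                      lower = round(ci_lower, 2),
                      upper = round(ci_upper, 2))
print(by_hand)
```

For Say et al. this gives an OR of 2.14 with an interval from 0.95 to 4.82, matching the values read from the forest plot.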

The vertical line at 1 indicates the location of the null or no effect. It is at 1 because an OR of 1 means that the odds of having the TT genotype are the same in both groups. The diamond at the bottom indicates the estimated OR from the overall meta-analysis results, and the width of the diamond indicates the 95% confidence interval for this. A larger box indicates the meta-analysis gave that study a larger weight and hence that it made a larger contribution to the overall results. Studies with greater weight have lower variation (i.e. greater precision) and a narrower confidence interval. These will often, but not always, be the larger studies.
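The link between precision and weight can be illustrated with inverse-variance weights, where each study's weight is 1 divided by the variance of its log odds ratio. This is a sketch under that assumption only: metabin() may weight studies differently (Mantel-Haenszel weighting is its usual default for odds ratios), so the percentages shown on the forest plot may not match these exactly. The variable names are illustrative.

```r
# Cell counts taken from the worksheet's table
study    <- c("Say et al 2005", "Rodriguez-Perez et al 2001", "Cheng et al 2012")
hyp_tt   <- c(22, 87, 165)
hyp_non  <- c(79, 212, 135)
ctl_tt   <- c(10, 60, 69)
ctl_non  <- c(77, 255, 81)

# Variance of log(OR): small cell counts make this large
var_log_or <- 1 / hyp_tt + 1 / hyp_non + 1 / ctl_tt + 1 / ctl_non

# Inverse-variance weight, expressed as a percentage of the total weight
iv_weight  <- 1 / var_log_or
weight_pct <- round(100 * iv_weight / sum(iv_weight), 1)

print(data.frame(study, var_log_or = round(var_log_or, 4), weight_pct))
```

Under this scheme Say et al., the smallest study, has the largest variance and receives the least weight, while Rodriguez-Perez et al. receives the most.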

Interpreting the Results

All three studies are consistent with the incidence of the TT genotype being higher in the Hypertension group than in the Control group, since the estimated odds ratios are all above 1. If the Odds Ratios had been less than 1, they would instead have pointed to a lower incidence in the Hypertension group.

However, only the Rodriguez-Perez study displayed a statistically significant effect, indicated by its p-value of 0.00 being below 0.05. This p-value is not exactly zero; it is zero only when rounded to two decimal places (a value such as 0.0004 would also be displayed as 0.00). A p-value like this should not be reported as p=0.00. Instead, give it to more decimal places or report it as an inequality, for example “p<0.001” when the unrounded value is below 0.001. Note also that the 95% confidence interval for this study does not include 1 and only covers values above 1. This also shows that this study found evidence that the incidence of the TT genotype is higher in the Hypertension group than in the Control group. The other two studies were not quite significant, since their p-values are both 0.07 and hence just above 0.05. Their 95% confidence intervals both include 1, which also indicates that the true odds ratio could be 1, meaning no difference.
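To see what lies behind a p-value displayed as 0.00, the study-level p-values can be recomputed to more decimal places. The sketch below assumes a two-sided Wald z-test on the log odds ratio. The software that produced the p-values quoted above may use a slightly different test, so small discrepancies are possible. The variable names are illustrative.

```r
# Cell counts taken from the worksheet's table
study    <- c("Say et al 2005", "Rodriguez-Perez et al 2001", "Cheng et al 2012")
hyp_tt   <- c(22, 87, 165)
hyp_non  <- c(79, 212, 135)
ctl_tt   <- c(10, 60, 69)
ctl_non  <- c(77, 255, 81)

# Log odds ratio and its standard error for each study
log_or    <- log((hyp_tt * ctl_non) / (hyp_non * ctl_tt))
se_log_or <- sqrt(1 / hyp_tt + 1 / hyp_non + 1 / ctl_tt + 1 / ctl_non)

# Two-sided p-value from the z statistic
z_stat  <- log_or / se_log_or
p_value <- 2 * pnorm(-abs(z_stat))

# Show three decimal places, and flag which studies fall below 0.05
print(data.frame(study,
                 z = round(z_stat, 2),
                 p = round(p_value, 3),
                 significant = p_value < 0.05))
```

Under this assumption only the Rodriguez-Perez study falls below 0.05, and its p-value is small but clearly not zero, which is why more decimal places or an inequality is preferable to writing p=0.00.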

However, the overall meta-analysis does show a statistically significant effect, since its p-value (also reported as 0.00 when rounded to two decimal places) is below 0.05. The Odds Ratio of 1.64 is greater than 1, which indicates that there is a higher incidence of the TT genotype in the Hypertension group than in the Control group.

The Odds Ratio value of 1.64 also tells us that the odds of a hypertensive person having the TT genotype are 1.64 times the odds for someone without hypertension. The 95% Confidence Interval for the Odds Ratio indicates that the true ratio could be between 1.27 and 2.12. For more help with odds and Odds Ratios see Borenstein et al. (2009).
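The pooled estimate can be approximated by hand. The sketch below uses a common-effect inverse-variance average of the log odds ratios, which is an assumption made for illustration. It also computes the Mantel-Haenszel pooled odds ratio, the method metabin() normally uses by default for odds ratios, for comparison. The variable names are illustrative.

```r
# Cell counts taken from the worksheet's table
hyp_tt   <- c(22, 87, 165)
hyp_non  <- c(79, 212, 135)
ctl_tt   <- c(10, 60, 69)
ctl_non  <- c(77, 255, 81)

# Study-level log odds ratios and inverse-variance weights
log_or <- log((hyp_tt * ctl_non) / (hyp_non * ctl_tt))
weight <- 1 / (1 / hyp_tt + 1 / hyp_non + 1 / ctl_tt + 1 / ctl_non)

# Weighted average on the log scale, with its standard error
pooled_log_or <- sum(weight * log_or) / sum(weight)
pooled_se     <- sqrt(1 / sum(weight))

# Back-transform to the odds ratio scale with a 95% interval
pooled_or <- exp(pooled_log_or)
pooled_ci <- exp(pooled_log_or + c(-1, 1) * qnorm(0.975) * pooled_se)
pooled_p  <- 2 * pnorm(-abs(pooled_log_or / pooled_se))

# Mantel-Haenszel pooled odds ratio for comparison
n_total <- hyp_tt + hyp_non + ctl_tt + ctl_non
or_mh   <- sum(hyp_tt * ctl_non / n_total) / sum(hyp_non * ctl_tt / n_total)

print(round(c(OR = pooled_or, lower = pooled_ci[1], upper = pooled_ci[2],
              OR_MH = or_mh), 2))
print(pooled_p < 0.001)
```

For these data both approaches round to a pooled OR of 1.64, and the inverse-variance interval rounds to 1.27 to 2.12, in line with the results quoted above.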

Checking Heterogeneity

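The \(I^2\) statistic estimates the percentage of the variation between study results that is due to real differences between the studies (heterogeneity) rather than chance. The Cochrane Handbook (section 9.5) suggests the following interpretation of the \(I^2\) values.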

I²	Interpretation
0% to 40%	Might not be important
30% to 60%	May represent moderate heterogeneity
50% to 90%	May represent substantial heterogeneity
75% to 100%	Considerable heterogeneity

From the forest plot we can see that \(I^2\) = 0.00 or 0%, so there is little or no heterogeneity evident. This suggests the three studies are estimating a similar effect, which supports combining them into a single pooled estimate.
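The \(I^2\) value can be checked by hand from Cochran's Q statistic, using I² = max(0, (Q − df) / Q) × 100, where df is the number of studies minus 1. The sketch below assumes inverse-variance weights around the common-effect pooled log odds ratio. The meta package may compute Q slightly differently depending on the pooling method, so treat this as an illustration rather than a reproduction of its output. The variable names are illustrative.

```r
# Cell counts taken from the worksheet's table
hyp_tt   <- c(22, 87, 165)
hyp_non  <- c(79, 212, 135)
ctl_tt   <- c(10, 60, 69)
ctl_non  <- c(77, 255, 81)

# Study-level log odds ratios, inverse-variance weights and pooled estimate
log_or        <- log((hyp_tt * ctl_non) / (hyp_non * ctl_tt))
weight        <- 1 / (1 / hyp_tt + 1 / hyp_non + 1 / ctl_tt + 1 / ctl_non)
pooled_log_or <- sum(weight * log_or) / sum(weight)

# Cochran's Q: weighted squared deviations of each study from the pooled value
q_stat <- sum(weight * (log_or - pooled_log_or)^2)
q_df   <- length(log_or) - 1

# I-squared as a percentage, truncated at zero when Q is below its df
i_squared <- max(0, (q_stat - q_df) / q_stat) * 100

print(round(c(Q = q_stat, df = q_df, I2 = i_squared), 2))
```

Here Q is smaller than its degrees of freedom, so \(I^2\) is truncated to 0%, consistent with the value shown on the forest plot.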

References

Borenstein, M., Hedges, L., Higgins, J. and Rothstein, H. (2009). Introduction to Meta-Analysis. John Wiley & Sons.

Tufanaru, C., Munn, Z., Stephenson, M. and Aromataris, E. (2015). Fixed or random effects meta-analysis? Common methodological issues in systematic reviews of effectiveness. International Journal of Evidence-Based Healthcare, 13(3), 196-207. DOI: 10.1097/XEB.0000000000000065

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

For more resources, see sigma.coventry.ac.uk. Adapted from material developed by Coventry University. Creative Commons License.