


Conducting Meta-Analyses of Proportions in R
Naike Wang, M.A.
naike.wang@jjay.cuny.edu
John Jay College of Criminal Justice, CUNY


Abstract

Meta-analysis of proportions is observational and non-comparative in nature. Rarely have we seen a study or tutorial demonstrate how a meta-analysis of proportions should be performed using the R programming language; this tutorial intends to fill that gap. The tutorial consists of two major components: (1) a comprehensive, critical review of the process of conducting a meta-analysis of proportions, highlighting a number of common practices that can lead to biased estimates and misleading inferences (e.g., failing to take study size and within-group estimates of between-study variance into consideration when calculating mean proportions in the presence of subgroups), and (2) a step-by-step guide to conducting the analysis in R. The process is described in six stages: (1) setting up the R environment and getting a sense of the data being analyzed; (2) calculating effect sizes; (3) identifying and quantifying heterogeneity; (4) constructing forest plots; (5) explaining heterogeneity with moderator analysis; and (6) assessing publication bias. In the last section (assessing publication bias), we argue that funnel plot analyses developed for investigating publication bias in randomized controlled trials may not be suitable for meta-analyses of proportions. Three computational options for transforming proportional data are incorporated in the code for users to choose from. The presentation of the tutorial is conceptually oriented, the use of formulas is kept to a minimum, and a published meta-analysis of proportions is used as an example to illustrate how to implement the R code and interpret the results of the analysis. Generic R code is provided for readers to use in their own analyses, and a video tutorial is also provided to facilitate learning (watch the video at wangnaike.com).

There are several good reasons to use this tutorial: (1) one does not need to purchase expensive statistical software such as Comprehensive Meta-Analysis (CMA) to perform this particular type of meta-analysis, because our code yields exactly the same results as CMA; (2) our code yields more accurate estimates of mean proportions in the presence of subgroups than the code other meta-analysis authors have used; (3) our code allows readers to convert proportional data using three transformation methods (no transformation, the logit, and the double-arcsine transformation), whereas CMA performs only the logit transformation; and (4) this tutorial will help readers understand why publication bias (due to non-significant results) is not pertinent in the context of meta-analyses of proportions.


Contents

1 Introduction

2 Setting up the R environment
  2.1 R and RStudio
  2.2 Setting up the working directory

3 Getting a sense of the data
  3.1 Illustrative example: Prevalence and epidemiological characteristics of congenital cataract (Wu et al., 2016)
  3.2 Preferred formats for organizing data in Excel

4 Calculating effect sizes
  4.1 Fixed-effect and random-effects models
  4.2 Transformation of proportions: the logit transformation and the double-arcsine transformation
  4.3 R code for calculating summary effect sizes

5 Identifying and quantifying heterogeneity
  5.1 Overview of heterogeneity: the between-study variance (τ²)
  5.2 Test for heterogeneity (Q)
  5.3 Quantifying heterogeneity (I²)
  5.4 An important caveat: model selection should not be based solely on heterogeneity tests and statistics
  5.5 R code for outputting results of the heterogeneity test and statistics

6 Creating and interpreting forest plots
  6.1 Visual inspection of forest plots
  6.2 Identifying outliers with formal tests
  6.3 R code for creating forest plots
  6.4 R code for identifying outlying and influential studies
  6.5 R code for removing outlying studies

7 Explaining heterogeneity with moderator analyses
  7.1 Overview of moderator analysis
  7.2 Subgroup analysis
  7.3 An important caveat regarding obtaining an overall summary effect size in the presence of subgroups
  7.4 Meta-regression
  7.5 Visualizing moderator analysis: scatter plots
  7.6 An important caveat: results of moderator analyses cannot be seen as causal evidence
  7.7 R code for calculating subgroup summary proportions, conducting subgroup analyses, and recalculating the overall summary proportion
  7.8 R code for creating forest plots in the presence of subgroups
  7.9 R code for conducting meta-regression
  7.10 R code for visualizing moderator analyses

8 The issue of publication bias in meta-analyses of proportions
  8.1 Overview of publication bias in the context of meta-analyses of proportions
  8.2 Detecting publication bias through visual inspection of funnel plots in meta-analyses of randomized controlled trials (RCTs)
  8.3 An important caveat: funnel plot asymmetry does not equal publication bias
  8.4 Detecting publication bias with formal tests: rank correlation test, Egger's regression test, and trim-and-fill
  8.5 An important caveat: a significant p-value is not indicative of the presence of publication bias
  8.6 R code for creating funnel plots and performing asymmetry tests

References


1 Introduction

A meta-analysis statistically synthesizes the quantitative findings of multiple studies that investigate the same research question, providing a numerical summary and estimate of a research area in an effort to better direct future work. A proportion is defined as the number of cases in a sample with a particular characteristic, divided by the size of the sample (Lipsey & Wilson, 2001). Meta-analyses of proportions synthesize a one-dimensional binomial measure known as the (weighted) average proportion (Nyaga et al., 2014), which is an average of the results (i.e., proportions) of multiple studies weighted by the inverse of their sampling variances under either the fixed-effect or the random-effects model (Wang & Liu, 2016).

While most meta-analyses focus on effect size metrics that capture a relationship between a treatment group and a control group, be it a standardized mean difference, odds ratio, relative risk, or risk difference, a meta-analysis of proportions aims to obtain a more precise estimate of the overall proportion of a certain case or event (Borenstein et al., 2009; Barendregt et al., 2013).

Studies included in meta-analyses of proportions are observational and non-comparative (i.e., single-arm). In other words, each study contributes a number of "successes" and a sample size (Hamza et al., 2008). For instance, a meta-analysis can be conducted to integrate several estimates of the proportion of individuals who suffer from both post-traumatic stress disorder and substance use disorder in samples of homeless veterans in various urban areas.

Meta-analyses of proportions are commonly conducted in a variety of prominent fields, such as medicine (Gillen, 2010), clinical psychology (Fusar-Poli et al., 2016), epidemiology (Wu et al., 2016), and public health (Keithlin et al., 2014). Results from these studies are frequently used in decision models (Hunter et al., 2014).

Many researchers have used the R programming language (R Development Core Team, 2017) to conduct meta-analyses of proportions (e.g., Ammirati et al., 2013; Fusar-Poli et al., 2016; Wu et al., 2016). The primary reason people choose R over other software for meta-analysis is that it is free. With other statistical software, one has to pay an expensive license fee for a limited period of use: you can usually use the software for a year, after which you must pay again to renew your license. Thus, you may lose access to the programs you learned once you leave school. Additionally, there is a growing library of R packages (i.e., extensions to R) developed for all kinds of specialized applications, including meta-analysis. This remarkable feature opens up a great deal of possibility and flexibility for manipulating data, especially when meta-analyzing proportions.

It is important to note that we usually need to apply transformations to proportional data in order to improve their statistical properties in terms of distribution and variance. The two most commonly used transformations are the logit and the double-arcsine transformation (not transforming the data is also appropriate under certain circumstances); we discuss this in more detail below. Both the metafor (Viechtbauer, 2010) and meta (Schwarzer et al., 2015) packages can perform these transformations, whereas other statistical software specifically designed for meta-analysis, such as Comprehensive Meta-Analysis (Borenstein et al., 2005) and MedCalc (Schoonjans, 2017), can perform only one of them. Additionally, Comprehensive Meta-Analysis and MedCalc transform data automatically, whereas in R users can decide whether the data should be transformed.

Rarely have we seen an R tutorial, either in the literature or on the Internet, that shows how to conduct a full meta-analysis of proportions. The purpose of this tutorial is to provide an introduction to performing meta-analyses of proportions in R; as far as we know, it is the first tutorial to do so. It reviews core statistical constructs and issues related to meta-analyses of proportions and uses data extracted from a published meta-analytic study to illustrate how to conduct the analysis in R. Due to space limitations, we give only one example showing how data transformation can be properly incorporated as an integral part of a meta-analysis of proportions. The R code has been validated against CMA; the results yielded by the two programs are identical.

Please note that this tutorial is designed for intermediate to advanced R users. We assume readers have a basic understanding of the principles of meta-analysis; in particular, we will not discuss the processes of searching the literature and collecting, coding, and extracting data.

2 Setting up the R environment

2.1 R and RStudio

First things first, you need to download R. The base R program (the latest version at the time of writing is 3.4.1) can be downloaded for free from the Comprehensive R Archive Network (https://cran.r-project.org/). R provides a basic graphical user interface (GUI), but we recommend that readers use a more productive code editor that interfaces with R, known as RStudio. RStudio is a development environment built to make using R as effective and efficient as possible, adding much more functionality above and beyond R's bare-bones GUI. To use RStudio, you must first install the latest version of R. RStudio is available for free at https://www.rstudio.com/. Once RStudio is installed on your computer, the first thing to do is create a new R script or R Markdown file. We recommend that readers code in R Markdown files; a full justification is beyond the scope of this article, but we suggest readers explore the RStudio website to get a sense of the unique features of R Markdown.

2.2 Setting up the working directory

Next, readers need to set up the programming environment by defining a working directory for the current R session. A working directory is a location (e.g., a folder) from which your data are read and in which all of your work is saved, such as the code you have written and any plots produced. It is useful for keeping your work organized. To set a working directory, type the following code in the RStudio source editor (the upper-left pane):

setwd("C:/data")

It is worth noting that when an .rmd file is created within an RStudio Project, the default working directory is the folder containing that particular .rmd file.
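You can confirm which folder the current session is using at any time with:

getwd()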

3 Getting a sense of the data

3.1 Illustrative example: Prevalence and epidemiological characteristics of congenital cataract (Wu et al., 2016)

We will illustrate the conduct of a full meta-analysis of proportions using data extracted from a published meta-analytic study. The purpose of this exercise is to illustrate the process of meta-analyzing proportions, not to answer substantive research questions. The example employs data from a meta-analysis conducted by Wu et al. (2016), which estimated the prevalence of congenital cataracts (CC) and their main epidemiological traits. CC refers to opacity of the lens detected at birth or at an early stage of childhood. It is the primary cause of treatable childhood blindness worldwide, and current studies have not determined its etiology. The few large-scale epidemiological studies on CC also have limitations: they involve specific regions, limited populations, and partial epidemiological variables. Wu et al. (2016) aimed to explore the etiology of CC and to estimate its population-based prevalence and major epidemiological characteristics, morphology, associated comorbidities, and etiology. The original dataset consists of 27 studies published from 1983 to 2014, among which 17 contained data on the population-based prevalence of CC, 2 were hospital-based studies, and 8 were CC-based case reviews. The samples investigated in the studies were drawn from different regions of the world, including Europe, Asia, the USA, Africa, and Australia. The sample sizes of the included studies ranged from 76 to 2,616,439 patients, with a combined total of 8,302,708 patients. Diagnosed age ranged from 0 to 18 years. The proportions were transformed with the logit transformation, the most commonly used transformation for proportional data, which results in a more normal sampling distribution with a mean of zero and a standard deviation of 1.83. The authors coded five moderators: world region (China vs. the rest of the world), study design (birth cohort vs. other), sample size (less vs. more than 100,000), diagnosed age (older vs. younger than 1 year), and research period (before vs. after the year 2000). All of these potential moderators are categorical variables. For the present tutorial, we will work with only a subset of the moderating variables employed in the example study: study design and sample size.

3.2 Preferred formats for organizing data in Excel

Prior to performing a meta-analysis in R, it is important to organize the data file properly. An excerpt of the example dataset used in this tutorial is shown in Table 1. In this table, each row represents the data extracted from one independent study included in the meta-analysis, and each column represents a variable that must be created in order to properly compute effect sizes, create plots, and conduct further analyses. The excerpt is drawn from Wu et al. (2016).

We have separate columns for author names and publication years, which is useful when we need to sort studies by publication year in R. We also need a column containing both author names and publication years if we decide to use the forest() command in the meta package to create forest plots, because that function cannot combine the author names and publication years into a single label automatically; in this case, the column is labeled authoryear. Note that when a data file is imported into R, uppercase letters in column names will be converted to lowercase, which means we cannot rely on uppercase/lowercase letters to distinguish between columns; we need to use different column names instead. In addition, column names cannot contain spaces. As can be seen in the table, we use authoryear instead of "author year", studydesign instead of "study design", and samplesize instead of "sample size". The variable cases represents the number of events of interest in each study's sample, and the variable total represents the sample size of each study. Dividing the values in cases by those in total yields the proportions needed to compute the effect sizes, which are labeled yi in R; R will also calculate the sampling variances based on the data, which are labeled vi. The remaining variables in the dataset are potential moderators (only study design and sample size are included in the present excerpt; readers can download the data file from my GitHub site to access the full dataset), which will be examined in either a subgroup analysis or a meta-regression. For instance, study design, as a potential moderator, has two categories or levels: birth cohort and others. We code 1 and 0, respectively, for these categories in the column labeled studesg. It is not strictly necessary to create the columns labeled studydesign and samplesize, but it is convenient to include them so that we can see what 1 and 0 actually represent. Note that either column can be used to conduct moderator analyses, and the results will be exactly the same, except when we want to create a scatterplot to visualize the analysis: if we use the studydesign column, R will create a box plot instead of a scatterplot. For continuous moderators, readers can simply create columns containing continuous values (e.g., the year column). We save the data in an Excel spreadsheet in comma-separated values (.csv) format in the working directory and name the file data.csv (you can name it whatever you want).
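To make the layout concrete, the first few rows of a file organized this way might look like the following (the values shown are illustrative placeholders, not figures from Wu et al. (2016)):

author   year   authoryear           cases   total    studydesign    studesg
Smith    1995   Smith et al., 1995   12      28000    birth cohort   1
Jones    2001   Jones et al., 2001   35      150000   others         0
Lee      2010   Lee et al., 2010     8       76000    birth cohort   1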


4 Calculating effect sizes

4.1 Fixed-effect and random-effects models

Before the meta-analysis can be conducted, we need to make a basic choice between two modeling approaches for calculating the summary effect size: the fixed-effect and the random-effects model (Hedges, 1992; Hedges & Vevea, 1998; Hunter & Schmidt, 2000). The fixed-effect model assumes that the studies included in a meta-analysis are functionally equivalent and thus share a common true effect size; in other words, the true effect sizes are identical across studies. The only reason the observed effect sizes vary across studies is the random sampling error inherent in each study, namely the within-study variance. Put differently, participants in the studies come from a single common population and go through the same experimental procedures performed by the same researchers under the same conditions. For instance, a series of studies with the same protocol conducted in the same lab, sampling from the same population (e.g., school children from the same class), can qualify for the fixed-effect model. However, these conditions rarely hold in reality. True effect sizes actually vary from study to study: in the vast majority of cases we include a group of studies on a common topic that are performed in different ways, which causes the true effect sizes to vary (Barendregt et al., 2013). Therefore, instead of being identical across studies, the true effects follow a normal distribution. The random-effects model allows the included studies to have true effect sizes that are not identical or "fixed" but normally distributed. In other words, the random-effects model differs from the fixed-effect model in the calculation of variance: the fixed-effect model assumes that between-study variance does not exist (i.e., it is zero), so that differences among observed effect sizes are due solely to within-study variance, whereas the random-effects model takes both within- and between-study variance into account. The fact that the fixed-effect model does not take study heterogeneity, or between-study variance, into consideration leads to a serious limitation: the conclusions drawn from a fixed-effect meta-analysis are limited to the particular set of included studies and cannot be generalized to a broader, more general population. Most social scientists, however, wish to make inferences that extend beyond the included set of studies. As a general rule of thumb, in most meta-analytic studies the random-effects model will be more plausible than the fixed-effect model because it permits more generalizable conclusions (Card, 2015; Vevea & Coburn, 2014). That said, we discourage the practice of switching from the fixed-effect model to the random-effects model based solely on the results of heterogeneity tests (Borenstein et al., 2005); we will discuss this in more depth later.

The random-effects model can be estimated by three methods (there are others, but here we focus on the three most popular ones): the method of moments, also known as the DerSimonian and Laird method (DL; DerSimonian & Laird, 1986); the restricted maximum likelihood method (REML; Raudenbush & Bryk, 1985); and the maximum likelihood method (ML; Hardy & Thompson, 1996). In all cases, the summary effect size (i.e., the summary proportion) is estimated as the weighted average of the observed effect sizes of the individual studies, where the weight for each study is the inverse of its total variance, that is, the sum of the within-study variance and the between-study variance (see the formulas below for more details; Ma et al., 2016). The methods differ mainly in how they estimate the between-study variance, commonly denoted τ² in the meta-analytic literature (more on this in the section on heterogeneity). The technical differences between these methods have been summarized elsewhere (Knapp et al., 2006; Thorlund et al., 2011) and will not be discussed here.
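In notation, the random-effects model described above can be written as follows (a standard formulation, added here for reference rather than reproduced from the original paper):

$$y_i = \mu + u_i + e_i, \qquad u_i \sim N(0, \tau^2), \qquad e_i \sim N(0, v_i),$$

where $y_i$ is the observed effect size of study $i$, $\mu$ is the mean of the distribution of true effects, $u_i$ is the study-specific deviation from that mean, and $v_i$ is the within-study sampling variance. The summary effect is the weighted average

$$\hat{\mu} = \frac{\sum w_i^{*} y_i}{\sum w_i^{*}}, \qquad w_i^{*} = \frac{1}{v_i + \hat{\tau}^2},$$

and the fixed-effect model is the special case with $\tau^2 = 0$.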

4.2 Transformation of proportions: the logit transformation and the double-arcsine transformation

We usually apply transformations to the observed proportions identified across a collection of studies so that the transformed proportions follow a normal distribution, in order to estimate the summary proportion accurately and increase the validity of the associated statistical analyses, e.g., tests of significance (Feng et al., 2014; Nyaga et al., 2014). Nevertheless, when the observed proportions are around 0.5 and the number of studies is sufficiently large, the proportions follow a binomial distribution that is approximately symmetrical. Under such circumstances, the normal distribution is a good approximation of the binomial distribution, and the raw proportion is an appropriate effect size statistic for analysis (Barendregt et al., 2013; Box et al., 2005; Wang & Liu, 2016). In fact, based on their simulation work, Lipsey and Wilson (2001) proposed that when the observed proportions derived from individual studies are between 0.2 and 0.8, and only the mean proportion across studies is of interest, the raw proportion can work adequately as an effect size index. In addition to the effect sizes, we also need their corresponding estimates of sampling variance or standard error (SE) to do a meta-analysis. The effect sizes are weighted by the inverse of their variances: a larger study is given more weight, so its effect size has greater impact on the overall mean. The procedure for calculating the effect size, standard error, sampling variance, and inverse variance weight for individual studies using direct (raw) proportions is as follows (Lipsey & Wilson, 2001):

The raw (direct) proportion is given by:

$$ES_p = p = \frac{k}{n} \tag{1}$$

with its sampling variance:

$$Var_p = SE_p^2 = \frac{p(1-p)}{n} \tag{2}$$

and the inverse variance weight:

$$w_p = \frac{1}{Var_p} = \frac{1}{SE_p^2} = \frac{n}{p(1-p)} \tag{3}$$

where $p$ is the proportion, $k$ is the number of individuals or cases in the category of interest, $n$ is the sample size, and $ES$, $SE$, $Var$, and $w$ stand for effect size, standard error, sampling variance, and inverse variance weight, respectively. Then, the weighted average proportion can be computed as follows:

$$\overline{ES} = \frac{\sum (w_i \, ES_i)}{\sum w_i} \tag{4}$$

with its sampling variance:

$$Var_{\overline{ES}} = SE_{\overline{ES}}^2 = \frac{1}{\sum w_i} \tag{5}$$

The confidence interval of the weighted average proportion can be expressed as follows:

$$P_L = \overline{ES} - Z_{(1-\alpha)}\,SE_{\overline{ES}}, \qquad P_U = \overline{ES} + Z_{(1-\alpha)}\,SE_{\overline{ES}} \tag{6}$$

where $Z_{(1-\alpha)} = 1.96$ when $\alpha = 0.05$.
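As a minimal numerical illustration of Eqs. (1) through (6), the following R sketch computes a fixed-effect weighted average of raw proportions by hand; the three studies here are invented toy data, not part of the example dataset:

k <- c(30, 45, 60) #events of interest in each toy study
n <- c(100, 150, 180) #sample sizes
p <- k/n #Eq. (1): raw proportions
varp <- p*(1 - p)/n #Eq. (2): sampling variances
w <- 1/varp #Eq. (3): inverse variance weights
pbar <- sum(w*p)/sum(w) #Eq. (4): weighted average proportion
sebar <- sqrt(1/sum(w)) #Eq. (5): its standard error
pbar + c(-1, 1)*1.96*sebar #Eq. (6): 95% confidence interval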

However, proportional data derived from real studies are rarely centered around 0.5 and are, in fact, often quite skewed (Hunter et al., 2014). In a group of studies collected for a meta-analysis of proportions, as the observed proportions move further from 0.5 and approach the margins, especially when they are less than 0.2 or greater than 0.8, they become less likely to be normally distributed (Lipsey & Wilson, 2001). As a result, the normal distribution will not adequately describe the observed proportions, and a continuing assumption of normality may result in biased estimation and misleading or invalid inference (Feng et al., 2014; Ma et al., 2016). Specifically, using the direct proportion as the effect size statistic in such cases may lead to an underestimation of the width of the confidence interval around the weighted average proportion and an overestimation of the degree of heterogeneity across the observed proportions (Lipsey & Wilson, 2001). When the distribution of a set of observed proportions is skewed (i.e., the observed proportions are extremely high or low), we usually apply transformations to the data to make them conform to the normal distribution as much as possible, enhancing the validity of subsequent statistical analyses (Barendregt et al., 2013). Specifically, after transforming the observed proportions, all analyses are conducted using the transformed proportion as the effect size statistic and the inverse of the variance of the transformed proportion as the study weight. For reporting, the transformed summary proportion and its confidence interval are converted back to proportions for ease of interpretation (Borenstein et al., 2009). In practice, the approximate likelihood approach (Agresti & Coull, 1998) is arguably the predominant framework for modeling proportional data (Hamza et al., 2008; Nyaga et al., 2014). There are two main ways to transform observed proportions within this framework: the logit, or log odds (Sahai & Ageel, 2012), and the Freeman-Tukey double arcsine (Freeman & Tukey, 1950; Miller, 1978). In the logit transformation, each observed proportion is first converted to the natural logarithm of its odds, ln(p/(1−p)) (i.e., the logit). After the transformation, the logit-transformed proportions are assumed to follow a normal distribution, and all analyses are performed with the logit as the effect size statistic. After the analysis, the logits are converted back into proportions for reporting. The procedure for calculating the logit, its standard error and inverse variance weight for individual studies, and the weighted average proportion under the logit transformation is as follows (Lipsey & Wilson, 2001):

The logit is calculated by:

$$ES_l = \log_e\!\left(\frac{p}{1-p}\right) = \ln\!\left(\frac{p}{1-p}\right) \tag{7}$$

with its sampling variance:

$$Var_l = SE_l^2 = \frac{1}{np} + \frac{1}{n(1-p)} \tag{8}$$

and the inverse variance weight:

$$w_l = \frac{1}{SE_l^2} = np(1-p) \tag{9}$$

The transformed values are converted back into the original units of proportions using:

$$p = \frac{e^{logit}}{e^{logit} + 1} \tag{10}$$
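For concreteness, Eqs. (7) through (10) can be computed by hand in R for a single hypothetical study (the numbers are invented):

k <- 15; n <- 200 #hypothetical study: 15 events out of 200
p <- k/n
logit <- log(p/(1 - p)) #Eq. (7): effect size on the logit scale
varl <- 1/(n*p) + 1/(n*(1 - p)) #Eq. (8): sampling variance of the logit
wl <- 1/varl #Eq. (9): inverse variance weight
exp(logit)/(exp(logit) + 1) #Eq. (10): back-transformed, returns p = 0.075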


Although widely employed in meta-analyses of proportions, the logit transformation has limitations in certain situations; two are highlighted here. First, the variance instability that plagues untransformed proportions persists even after the logit transformation (Barendregt et al., 2013; Hamza et al., 2008). We transform data in an attempt to bring them closer to a normal distribution, or at least to give them more constant variance. While the logit transformation creates a sampling distribution that is more approximately normal, it fails to stabilize the variance and thus may place undue weight on some studies (Barendregt et al., 2013; Hamza et al., 2008). According to the equation used to compute the corresponding sampling variance (Eq. 8), for a fixed value of n the variance changes with p: for two studies of the same sample size (e.g., 100), an observed proportion close to 0 or 1 yields a grossly magnified variance, whereas an observed proportion around 0.5 yields a squeezed variance, which results in variance instability (Barendregt et al., 2013). Second, in situations where the event of interest is extremely rare (i.e., p = 0) or extremely common (i.e., p = 1), the logit and its sampling variance become undefined. In practice, the common solution is to add an arbitrary continuity correction of 0.5 to np and n(1−p) for all studies, including those without this problem (Hamza et al., 2008). However, including such studies with a 0.5 continuity correction has been shown to bias the results even further (Ma et al., 2016). Both of these problems can be solved gracefully by employing the variance-stabilizing transformation proposed by Freeman & Tukey (1950), known as the double arcsine, which is accomplished with the following equation:


$$ES_t = \frac{1}{2}\left(\sin^{-1}\sqrt{\frac{k}{n+1}} + \sin^{-1}\sqrt{\frac{k+1}{n+1}}\right) \tag{11}$$

The sampling variance is computed by:

$$Var_t = \frac{1}{4n + 2} \tag{12}$$

The back-transformation is computed by the equation proposed by Miller (1978):

$$p = \frac{1}{2}\left[1 - \operatorname{sgn}(\cos t)\,\sqrt{1 - \left(\sin t + \frac{\sin t - \frac{1}{\sin t}}{n'}\right)^{2}}\,\right] \tag{13}$$

where $t$ denotes the double-arcsine transformed value (or a confidence limit around it) and sgn is the sign operator. In the back-transformation equation (Eq. 13), the sample size $n'$ indicates that the harmonic mean of the individual sample sizes is used in the inversion formula (Miller, 1978). The harmonic mean is defined as:

$$n' = m\left(\sum_{i=1}^{m} n_i^{-1}\right)^{-1} \tag{14}$$

where $n_i$ denotes the sample size of each included study and $m$ denotes the number of included studies. Miller (1978) gives an example in his paper: a meta-analysis of proportions includes 4 studies whose sample sizes are 11, 17, 21, and 6, respectively. The harmonic mean would be:

$$n' = \frac{4}{\frac{1}{11} + \frac{1}{17} + \frac{1}{21} + \frac{1}{6}} = 10.9885 \tag{15}$$
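Eq. (15) is easy to verify in R:

n <- c(11, 17, 21, 6) #sample sizes from Miller's (1978) example
length(n)/sum(1/n) #harmonic mean: 10.9885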

Following the suggestions of Lipsey and Wilson (2001) and Viechtbauer (2010): using direct proportions is adequate when the observed proportions identified across studies are between 0.2 and 0.8; applying the logit transformation is acceptable when the observed proportions are less than 0.2 or greater than 0.8; and the double-arcsine method is the more appropriate choice when sample sizes are small and/or extreme proportions need to be handled. This tutorial will demonstrate all three transformation methods in R (unfortunately, no studies have been conducted comparing which method to use under different circumstances).

4.3 R code for calculating summary effect sizes

We will now begin the first step of our meta-analysis. First, readers need to install the required packages, which run within R and contain collections of functions needed to perform meta-analyses. In this tutorial, we will need two packages: metafor (Viechtbauer, 2010) and meta (Schwarzer et al., 2015). This tutorial is primarily based on metafor because it gives users control over every step of manipulating the raw data. We will also use the meta package because its forest() function is much more convenient for creating forest plots. To install these packages, execute the following command:

install.packages(c("metafor", "meta"))

Once a package is installed, it is permanently available in R. To use an installed package, however, you must execute the library() command each time you open R; in other words, if you close R and restart it, you will have to issue this command again in order to use the packages. Readers can load the packages we just installed into the current R session with:

library(metafor)

library(meta)

We then need to import the data saved in the file data.csv and create an object named dat to store the data in R. This can be achieved by running the following code:

dat=read.csv("data.csv", header=T, sep=",")
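Before proceeding, it is good practice to confirm that the file was imported as expected; for example:

head(dat) #display the first few rows
str(dat) #list the variables and their types
nrow(dat) #the number of included studies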

To estimate the summary effect size (i.e., the weighted average proportion), we fit a meta-analytic model in metafor using the escalc(), rma(), and predict() functions. These functions, along with a number of arguments specified within their parentheses, instruct R on how effect sizes should be calculated. Some of the arguments are defaults, such as weighted=TRUE, so users do not need to specify them. The escalc() function estimates the individual effect sizes and their corresponding sampling variances. One can decide whether to transform these effect sizes with the measure= argument. For a meta-analysis of proportions, this function has the following general format:

ies=escalc(xi=cases, ni=total, data=dat, measure="PR"/"PLO"/"PFT")

Here, ies is the name of the object in which the results of the escalc() function will be stored; we have named it ies, which stands for individual effect sizes. The function will yield the individual effect sizes yi and their corresponding sampling variances vi based on the information stored in dat. cases is the variable in dat containing the number of events of interest in each study, and total is the variable containing the sample size of each study. Finally, the measure= argument dictates which computational option is used to transform the raw proportions:

measure="PR" #no transformation

measure="PLO" #the logit transformation

measure="PFT" #the double-arcsine transformation

We then pool the individual effect sizes and their sampling variances with the inverse variance method using the rma() function. To do this, we can execute:

pes/pes.logit/pes.da=rma(yi, vi, data=ies, method="DL"/"REML")

If we decide not to perform a transformation, we suggest naming this object pes, which stands for pooled effect size; if we perform a transformation, either the logit or the double arcsine, we suggest naming it pes.logit or pes.da, which stand for the logit- and double-arcsine-transformed summary effect size, respectively. This object stores the results generated by the rma() function, which yields a pooled effect size based on the individual effect sizes and sampling variances contained in ies. The method= argument dictates which of the following between-study variance estimators is used (the default is REML):

method="DL" #random effects using the DerSimonian-Laird estimator

method="REML" #random effects using the restricted maximum-likelihood estimator

If unspecified, rma() estimates the variance component with the REML estimator. Even though rma() stands for random-effects meta-analysis, the function can also perform a fixed-effect meta-analysis with:

method="FE"

The object pes.logit or pes.da now contains the estimate of the transformed summary proportion. To convert it back to its original, non-transformed measurement scale (i.e., a proportion) and yield the true summary proportion, we use the predict() function:

pes=predict(pes.logit, transf=transf.ilogit)

pes=predict(pes.da, transf=transf.ipft.hm, targs=list(ni=dat$total))

The transf= argument dictates how the transformed summary effect size is converted back to a proportion:

transf=transf.ilogit #inverse of the logit transformation

transf=transf.ipft.hm, targs=list(ni=dat$total) #inverse of the double-arcsine transformation

Note that we use transf.ipft.hm instead of transf.ipft because we want to use the harmonic mean of the individual sample sizes (explained above). Finally, to see the output for the true summary proportion and its 95% CI, we use the print() function:

print(pes)

For readers' convenience, the generic code for generating results for a fitted random-effects model using the three transformation methods is provided here:


Option 1: no transformation

ies=escalc(xi=cases, ni=total, data=dat, measure="PR")

pes=rma(yi, vi, data=ies)

print(pes)

Option 2: the logit transformation

ies.logit=escalc(xi=cases, ni=total, data=dat, measure="PLO")

pes.logit=rma(yi, vi, data=ies.logit)

pes=predict(pes.logit, transf=transf.ilogit)

print(pes)

Option 3: the double-arcsine transformation

ies.da=escalc(xi=cases, ni=total, data=dat, measure="PFT", add=0)

pes.da=rma(yi, vi, data=ies.da)

pes=predict(pes.da, transf=transf.ipft.hm, targs=list(ni=dat$total))

print(pes)

Note the use of add=0 in the escalc() call of Option 3. When a study contains proportions equal to 0, R will automatically add 0.5 to the observed data (i.e., the number of events of interest, namely the cases variable). Since the double-arcsine transformation does not require any adjustment to the data in this situation, we can explicitly switch the default add=0.5 to add=0 to suppress the adjustment. Returning to the running example, the summary proportion is generated using Option 2 (i.e., the logit transformation) on the grounds that all of the observed proportions in the dataset are far below 0.2 and there are no zero events. Thus, we would execute:

ies.logit=escalc(xi=cases, ni=total, measure="PLO", data=dat)

pes.logit=rma(yi, vi, data=ies.logit, method="DL", level=95)

pes=predict(pes.logit, transf=transf.ilogit)

print(pes, digits=6)

The digits= argument specifies the number of decimal places to which the printed results are rounded (the default is 4). The level= argument specifies the confidence level (the default is 95). In this case, if we use the 95% CI, the point estimates of τ, τ², and I² will fall outside their confidence intervals; switching to the 99% CI would solve this problem. We will nevertheless keep using the 95% CI throughout this tutorial so that readers can compare the results computed by our code with those reported by Wu et al. (2016).

Interpreting the resulting summary statistics, we find that the summary proportion is 0.000424 (95% CI = 0.000316 to 0.000569).

5 Identifying and quantifying heterogeneity

5.1 Overview of heterogeneity: the between-study variance (τ²)

Meta-analyses aim to produce a more precise estimate of an effect by synthesizing studies. An important decision that authors of meta-analyses need to make is whether it makes sense to combine a set of identified studies at all, given that the studies inevitably differ in their characteristics to varying degrees. If we combine studies whose estimates vary substantially from one another, the summary effect we estimate and the conclusions we draw may not be accurate or valid. For instance, suppose a meta-analysis of proportions concludes that the summary proportion of juvenile offenders who re-offend in a city falls in a medium range (i.e., around 0.5), but considerable variation exists among the individual proportions: studies conducted in some boroughs report small proportions (e.g., under 0.1) while others report very large ones (e.g., above 0.9). In this case, reporting that the mean proportion is moderately large would be misleading, because it fails to convey the core finding that there is large variation, or inconsistency, in the observed effect sizes across studies. This variation is what is known as heterogeneity (Del Re, 2015; Borenstein et al., 2005). We can partition the observed variability into two distinct components: the between-study variance, due to true or real variation among a body of studies, and the within-study variance, due to sampling error. The real variation can be attributed to clinical and/or methodological diversity, namely systematic differences between studies beyond what would be expected by chance, such as experimental designs, measurements, sample characteristics, interventions, study settings, and any combination of such factors (Cornell et al., 2014; Lijmer, 2002; Thompson & Higgins, 2002; Veroniki et al., 2016). For the purposes of meta-analysis, we are interested only in the true variation in effect sizes (i.e., the between-study variance). We characterize study heterogeneity by its standard deviation, τ, a statistic called tau. Under the assumption of normality, 95% of the true effects are expected to fall within ±2τ of the point estimate of the summary effect size (Cornell et al., 2014). The between-study variance, τ², called tau-squared, reflects the amount of true heterogeneity on an absolute scale (Borenstein et al., 2005), that is, the total amount of systematic differences in effects across studies. The total variance of a study is the sum of the between- and within-study variances and is used to assign weights under the random-effects model (i.e., weights are the inverse of the total variance). As mentioned in the section on calculating effect sizes, τ² can be estimated by several methods (e.g., ML, REML, DL). Researchers have not come to an agreement as to which of the three methods is best for estimating the random-effects model. The ML estimator tends to underestimate the between-study variance when the number of studies included in a meta-analysis is small (Schwarzer et al., 2015; Nyaga et al., 2014). Simulation studies (e.g., Chung et al., 2014) have demonstrated that the DL estimator has the same issue: it tends to produce a downwardly biased between-study variance, potentially yielding overly narrow confidence bounds for the summary effect and p-values that are too small when the number of studies is small or when there is substantive variability in effect sizes (Bowden et al., 2011; Cornell et al., 2014). Jackson et al. (2010) suggest that the DL procedure is preferable when the sample size is large and only the mean effect size is of interest. In their book, Borenstein et al. (2005) mention that some statisticians favor the REML method despite its computationally demanding nature, and one of the authors of the DerSimonian and Laird paper has argued that the DL estimator should no longer be used. Although the REML estimator has generally been shown to produce superior results to the DL estimator, a number of studies have shown that the differences between the results of the two approaches are negligible and rarely pronounced enough to have a substantive impact on the conclusions drawn from the analysis (Thorlund et al., 2011). Nevertheless, since all of the aforementioned estimators have limitations in estimating the amount of true variation in effect sizes, the 95% confidence interval around the point estimate of τ² should be obtained, especially when the number of studies included in a meta-analysis is small (Veroniki et al., 2016). In practice, the DerSimonian and Laird random-effects model is arguably the most popular statistical method for meta-analysis of proportions and has become the conventional procedure and the default in many software packages for gauging the amount of heterogeneity (Cornell et al., 2014; Hamza et al., 2007; Ma et al., 2016; Schwarzer et al., 2015; Thorlund et al., 2011), because it is the easiest to compute and explain compared with the other methods (Borenstein et al., 2005).
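For reference, the DL point estimate of τ² takes the following closed form (a standard result, stated here for completeness rather than taken from the original paper):

$$\hat{\tau}^2_{DL} = \max\left(0,\ \frac{Q - (k - 1)}{\sum w_i - \sum w_i^2 / \sum w_i}\right),$$

where $Q$ is the heterogeneity statistic introduced in the next subsection, $k$ is the number of studies, and $w_i = 1/v_i$ are the fixed-effect inverse variance weights.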

5.2 Test for heterogeneity (Q)

The first way to identify heterogeneity is through visual inspection of forest plots; we will discuss this in depth in its own section below. With formal tests, the presence of study heterogeneity is generally examined using a χ² test with the statistic Q under the null hypothesis that all studies share the same true effect (Hedges & Olkin, 1985). In other words, the Q-test and its p-value serve as a test of significance of the null hypothesis H₀: τ² = 0. If the value of the Q-statistic exceeds the critical χ² value, we reject the null hypothesis and conclude that the effect sizes are heterogeneous; under such circumstances, one may consider taking the random-effects route. If Q does not exceed this value, we fail to reject the null hypothesis. It is very important to note that, when the p-value is non-significant, we must be very cautious about concluding that the true effects are homogeneous. The statistical power of the Q-test depends heavily on the number of studies included in a meta-analysis, so it may fail to detect heterogeneity simply due to a lack of power when the number of included studies is small (i.e., fewer than 10) and/or the included studies are of small size (Huedo-Medina et al., 2006). Therefore, a non-significant result (p > 0.05) cannot be interpreted as empirical evidence of homogeneity (Hardy & Thompson, 1998). This issue needs to be taken seriously, as 75% of the meta-analyses reported in Cochrane reviews have been found to contain five or fewer studies (Davey, 2011). Beyond this limitation, the Q-test only tests whether the null hypothesis is viable; it cannot quantify the magnitude of true heterogeneity in effect sizes (Card, 2015).
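The Q-statistic itself is computed as follows (the standard definition, stated here for reference):

$$Q = \sum_{i=1}^{k} w_i \left( ES_i - \overline{ES} \right)^2, \qquad w_i = \frac{1}{v_i},$$

where $\overline{ES}$ is the inverse-variance weighted mean of Eq. (4); under the null hypothesis of homogeneity, Q follows a χ² distribution with $k - 1$ degrees of freedom, $k$ being the number of studies.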

5.3 Quantifying heterogeneity (I²)

The I² statistic proposed by Higgins et al. (2003), a measure of heterogeneity that is not directly influenced by the number of studies, addresses these issues by estimating the proportion of the observed variability that constitutes true variation between studies. Put differently, I² is roughly the ratio of the between-study variance to the total observed variance (i.e., the sum of the between- and within-study variances). It therefore allows us to compare estimates of heterogeneity across meta-analyses, regardless of the scales on which those meta-analyses were conducted. I² can take values from 0% to 100%. A value of 0% means that all of the heterogeneity is caused by sampling error and there is nothing to explain; a value of 100% means that the overall heterogeneity can be accounted for exclusively by true differences between studies, in which case it makes sense to apply subgroup analyses or meta-regressions to identify potential moderating factors that can explain the inconsistencies among effect sizes across studies. By convention, I² values of 25%, 50%, and 75% indicate low, medium, and large heterogeneity, respectively (Higgins et al., 2003); note that these are only tentative benchmarks. The 95% CI around the I² statistic should also be calculated (Cuijpers, 2016; Ioannidis et al., 2007), because the value of I² by itself can be misleading: an I² of 0 with a 95% CI ranging from 0 to 80% in a small meta-analysis is not indicative of homogeneity. Rather, the degree of heterogeneity in such a case is uncertain.
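In terms of the Q-statistic of the previous subsection, I² is computed as (Higgins et al., 2003):

$$I^2 = \max\left(0,\ \frac{Q - (k-1)}{Q}\right) \times 100\%,$$

where $k - 1$ is the degrees of freedom of the Q-test.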

5.4 An important caveat: model selection should not be based solely on heterogeneity tests and statistics

Together, the Q-statistic, τ², and I² can tell us whether the effects are homogeneous, or consistent. When the effect sizes are reasonably consistent, it is appropriate to combine them and report a summary effect size. When moderate or substantial heterogeneity is present, the summary effect size is of less, or even no, value. In practice, however, some researchers have adopted the habit of automatically selecting the random-effects model in the presence of heterogeneity and reporting an overall summary effect even when the effects are highly inconsistent. In such cases, we strongly suggest that readers conduct moderator analyses to provide a thorough explanation of the possible sources of heterogeneity in the observed effect sizes, instead of placing too much emphasis on the mechanistic calculation of a single estimate of the mean effect (Egger et al., 1998); we discuss moderator analysis in more detail later. Moreover, as we have seen, the methods for estimating the amount of heterogeneity and the significance tests for heterogeneity are not very reliable in many cases, and may lead to poor estimates and misleading interpretations of the variability of true effects across studies. If we underestimate the between-study variance and thus understate study heterogeneity (or even fail to detect heterogeneity at all, since the tests often suffer from low power), we risk fitting the wrong model (i.e., the fixed-effect model), making inaccurate inferences about the overall effect, and missing the opportunity to explore and investigate potential sources of systematic variation between studies, which is in fact one of the core components of a meta-analysis (Thompson, 1994; Thompson & Sharp, 1999; Higgins & Thompson, 2002). In conclusion, the selection of a model should be based on a combination of factors, such as the type of conclusion one wishes to draw, the expected distribution of true effects, the statistical significance of the test for heterogeneity, the number of included studies, and so on (Borenstein et al., 2005; Card, 2015; Whitehead, 2002).

5.5 R code for outputting results of the heterogeneity test and statistics

To view the results of the test for heterogeneity (Q), the estimate of the between-study variance (τ²), and the estimate of the proportion of observed variability that reflects between-study variance (I²), we again use the print() function:

print(pes or pes.logit or pes.da)

The confint() function computes and displays the confidence intervals of τ² and I²:

confint(pes or pes.logit or pes.da)

To display the heterogeneity-related results for the running example, we would type:

print(pes.logit, digits=4)

confint(pes.logit, digits=2)

The output reveals that, in the example data, τ² is 0.3256 (95% CI = 0.3296, 1.4997), I² is 97.24% (95% CI = 97.28, 99.39), and the Q-statistic is 580.5387 (p < 0.001), all of which suggests high heterogeneity in the effect sizes. Again, the point estimates of τ, τ², and I² fall outside their 95% CIs; readers can fix this problem by switching to the 99% CI.

6 Creating and interpreting forest plots

6.1 Visual inspection of forest plots

A forest plot is a graph that visualizes the point estimates study effects and their confidence inter-

vals (Lewis & Clarke, 2001). It is constructed using two perpendicular lines. The horizontal line

represents the outcome measure, in our case, the proportion. As for the vertical line that inter-

sects the horizontal line, the value at the line is relevant to the statistic being used. Specifically,

in meta-analyses that use relative statistics, such as odds ratio and risk ratio, the vertical line

represents where the intervention has no effect and thus is placed at the value of 1. For absolute

statistics such as absolute risk and standard mean difference, the null difference value is 0. The

two kinds of aforementioned meta-analyses all focus on a treatment effect or a relationship be-

tween two groups. The statistic (i.e., proportion) being estimated in meta-analyses of proportions

is simply a single group summary since a meta-analysis of proportions has only one arm (Boren-

stein et al., 2005). Therefore, in meta-analyses of proportions, the vertical line is usually placed at

the value of the point estimate of the summary proportion, which is depicted as a diamond at the

bottom with the horizontal tips of the diamond representing the 95% confidence interval of the

summary effect. Each study effect plotted on a forest plot has two components to it: a box repre-

senting the point estimate of each study effect and a horizontal line through the box representing

the confidence interval around the point estimate. The size of the box gives a representation of

the sample size or the weighting of each study, meaning the bigger the sample size, the bigger the

box, the shorter the horizontal line. A shorter horizontal line suggests a better precision of the

study results. A bigger box with a shorter horizontal line indicates that the study has more im-

pact on the summary effect size (Anzures-Cabrera & Higgins, 2010). The forest plot is very useful

for understanding the nature of the data being analyzed because it provides a simple visual repre-

sentation of the amount of variation between effect sizes across studies. Study effects are regarded

as homogeneous if the horizontal lines of all studies overlap (Ried, 2006; Petrie et al., 2003). The

forest plot also allows us to detect outliers (Cuijpers, 2016). This can be achieved by identifying

studies whose 95% confidence intervals do not overlap with that of the summary effect. Further-

more, it is worth noting that if large studies are outliers then the overall heterogeneity could be

high.
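This visual check can also be approximated numerically. The sketch below (assuming the objects ies.logit and pes.logit from the running example, and Wald-type 95% CIs on the logit scale) flags studies whose confidence intervals fall entirely outside that of the summary effect:

study.lb=ies.logit$yi-1.96*sqrt(ies.logit$vi) #lower bounds of the study CIs (logit scale)
study.ub=ies.logit$yi+1.96*sqrt(ies.logit$vi) #upper bounds of the study CIs (logit scale)
summary.ci=predict(pes.logit) #summary estimate and its CI on the logit scale
dat$author[study.lb>summary.ci$ci.ub | study.ub<summary.ci$ci.lb] #non-overlapping studies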


6.2 Identifying outliers with formal tests

It is crucial to conduct several formal tests to determine if the outlying effect sizes identified by

examining the forest plot are truly outliers. If they are considered outliers, further investigation

is needed to determine whether or not they are actually influential to the overall effect size. A di-

agnostic plot known as the Baujat plot (Baujat et al., 2002) has been proposed to identify studies

that contribute to heterogeneity in meta-analytic data. The horizontal axis represents the con-

tribution of each study to the Q-statistic for heterogeneity. The vertical axis illustrates the in-

fluence of each study on the summary effect size. Studies that appear in the top right quadrant

of the graph contribute most to both of these factors.
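In metafor, such a plot can be drawn directly from a fitted model object with the built-in baujat() function; a minimal sketch using the running example's pes.logit is:

baujat(pes.logit) #x-axis: contribution to Q; y-axis: influence on the summary effect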

Outlying effect sizes can also be identified by screening for externally studentized residuals that are larger than 2 or 3 in absolute value

(Tabachnick & Fidell, 2013; Viechtbauer & Cheung, 2010). An outlying effect size, however, may

not be considered to be influential unless its exclusion leads to significant changes in the fitted

meta-analytic model and exerts considerable influence on the summary effect size. Viechtbauer &

Cheung (2010) have proposed a set of case deletion diagnostics derived from linear regression anal-

yses to identify influential studies, such as difference in fits values (DFFITS), Cook’s distances,

leave-one-out estimates for the amount of heterogeneity (i.e., τ 2 ) as well as the test statistic for

heterogeneity (i.e., Q-statistic). In leave-one-out analyses, each study is removed in turn, and the summary proportion is re-estimated based on the remaining n-1 studies. In this way, we can assess the influence that each study has on the summary proportion.

As a final note, instead of simply eliminating studies that yield outlying effect sizes, one should

investigate these outliers and influential cases fully to understand their occurrence. Often they

could reveal valuable study characteristics that can be used as potential moderating variables to

account for some of the between-study variability.

6.3 R code for creating forest plots

Many authors of meta-analyses of proportions fail to create forest plots properly when a

meta-analysis contains subgroups of studies. Specifically, many published meta-analytic studies

have failed to display correct estimates for overall and subgroup summary proportions in their for-

est plots. Simply put, let us say that a group of studies are divided into two subsets of studies in

a subgroup analysis. The inappropriate approach to calculating the subgroup summary propor-

tions is to treat the two subgroups as independent sets of studies. In fact, it is quite possible that they share a common between-study variance component,


so a common estimate of τ 2 needs to be applied to each study when calculating subgroup and

overall summary proportions. The summary proportion across all studies calculated in the pres-

ence of subgroups is also different from that derived in the absence of subgroups. We will discuss

this issue in more detail in the next section.

In this section, we will begin with learning how to create a basic forest plot using the meta pack-

age. We will show readers how to create a more complex forest plot in the next section after a

thorough discussion on the aforementioned issue.

We can create a simple forest plot that does not have any subgroups with the following generic

code (assuming that we have loaded the meta package):

pes.summary=metaprop(cases, total, authoryear, data=dat, sm="PRAW"/"PLO"/"PFT")

forest(pes.summary)

The sm= argument in the metaprop() function dictates which transformation method will be used

to convert the original proportions:

PRAW #no transformation

PLO #the logit transformation

PFT #the double arcsine transformation

Forest plots created by the generic code are bare-bones and may fail to meet publishing standards

in many cases. The following code can produce publication-quality forest plots using the data in

the running example:

pes.summary=metaprop(cases, total, authoryear, data=dat, sm="PLO",

method.tau="DL", method.ci="NAsm")

forest(pes.summary,

xlim=c(0,4),

pscale=1000,

rightcols=FALSE,

leftcols=c("studlab", "event", "n", "effect", "ci"),

leftlabs=c("Study", "Cases", "Total", "Prevalence", "95% C.I."),

xlab="Prevalence of CC", smlab="",

weight.study="random", squaresize=0.5, col.square="navy",

col.square.lines="navy",

col.diamond="maroon", col.diamond.lines="maroon",


pooled.totals=FALSE,

comb.fixed=FALSE,

fs.hetstat=10,

print.tau2=TRUE,

print.Q=TRUE,

print.pval.Q=TRUE,

print.I2=TRUE,

digits=2)

which produces:

It should be mentioned that given space constraints we have only listed the most essential argu-

ments in the forest() function to create a forest plot. Readers are referred to the documentation of the

meta package to explore more useful arguments to customize their own forest plots.

We can order the individual studies by precision in order to help us visually inspect the nature of

the data and examine publication bias. This can be achieved by using SE or the inverse of SE as

the measure of precision and creating an object to store the estimates:

precision=sqrt(ies.logit$vi) or 1/sqrt(ies.logit$vi)

And then we add the sortvar= argument in the forest() function:

sortvar=precision


which produces:

This graph clearly shows that the CC prevalence estimates are larger when studies are smaller and less pre-

cise. In meta-analyses of comparative studies (i.e., meta-analytic studies that use OR, RR, etc. as

effect size), what we would like to see is an even spread of studies with varying precision on either

side of the mean effect size because it indicates a lack of publication bias in most cases. However,

in a meta-analysis of observational data (e.g., proportions), an uneven spread of studies may ac-

tually reflect a genuine pattern in effect sizes instead of publication bias, especially when small

studies fall to the right side of the mean (the reasons are explained in detail in the next sec-

tion). It is also possible that some small studies are not published due to legitimate reasons, such

as the use of poor methods. Thus, this uneven distribution of effects is certainly worth investigat-

ing further, which may provide new insight into the topic of interest.

Notice that the estimate of τ 2 is 0.33 in the absence of subgroups.

A visual inspection of the forest plot identifies several suspicious outlying studies, including Bermejo

(1998), SanGiovanni (2002), and Halilbasic (2014).

6.4 R code for identifying outlying and influential studies

Next, we have to conduct a few formal tests to confirm our visual inspection of the forest plot.

The first test involves screening for externally studentized residuals that are larger than 2 or 3 in

absolute value. Using the following generic code, studies will be shown in descending order accord-

ing to their residual estimates:


stud.res=rstudent(pes/pes.logit/pes.da)

abs.z=abs(stud.res$z)

stud.res[order(-abs.z)]

Performing the test on the running example outputs:

The key here is to find studies with z-values that are bigger than 2 or 3 (depending on the num-

ber of studies included). Since we only have 17 studies in the running example, we would set the

cut-off at 2; thus, the second, eighth, and twelfth studies are flagged. They match the studies we

visually identified earlier.

Outliers have the potential to be influential, but we generally have to investigate further to deter-

mine whether or not they are truly influential. This can be achieved by performing a set of leave-

one-out diagnostic tests:

Option 1: no transformation

L1O=leave1out(pes); print(L1O)

Option 2: the logit transformation

L1O=leave1out(pes.logit, transf=transf.ilogit); print(L1O)

Option 3: the double-arcsine transformation

L1O=leave1out(pes.da, transf=transf.ipft.hm, targ=list(ni=dat$total)); print(L1O)

Using the example data, we first execute the following code:

L1O=leave1out(pes.logit, transf=transf.ilogit); print(L1O, digits=6)

which outputs:


The numbers in the printed dataset look daunting at first. They are actually very easy to inter-

pret. For instance, the first estimate in the first column (i.e., 0.000419) is the estimate for the

summary proportion that is derived when we take the first study out of this set of studies and re-

calculate the overall mean. In other words, if the first study is left out of this set of studies, the

estimate for the observed summary proportion (0.000424) will become 0.000419.

We can actually visualize the change in the summary effect size with a forest plot using metafor.

The generic code is given below:

Option 1: no transformation

l1o=leave1out(pes)

yi=l1o$estimate; vi=l1o$se^2

forest(yi, vi,

slab=paste(dat$author, dat$year, sep=","),

refline=pes$b,

xlab="Summary proportions leaving out each study")

Option 2: the logit transformation

l1o=leave1out(pes.logit)

yi=l1o$estimate; vi=l1o$se^2

forest(yi, vi, transf=transf.ilogit,

slab=paste(dat$author, dat$year, sep=","),

refline=pes$pred,

xlab="Summary proportions leaving out each study")

Option 3: the double-arcsine transformation

l1o=leave1out(pes.da)

yi=l1o$estimate; vi=l1o$se^2


forest(yi, vi, transf=transf.ipft.hm, targ=list(ni=dat$total),

slab=paste(dat$author, dat$year, sep=","),

refline=pes$pred,

xlab="Summary proportions leaving out each study")

The forest plot below is generated with the data in the running example. Each box represents a

summary proportion estimated leaving out a study. The reference line indicates where the original

summary proportion lies. From the graph we can deduce that the further a box deviates from the

reference line, the more pronounced the impact of the corresponding left-out study will be on the

original summary proportion.

With these potential influential studies in mind, we now conduct a few more leave-one-out diagnostics with a built-in function in metafor to verify our suspicions:

inf=influence(pes/pes.logit/pes.da)

print(inf); plot(inf)


which outputs:

We have actually described these tests earlier, so we will not repeat them here. The plot below the

printed dataset visualizes the leave-one-out estimates. Influential studies are marked with an as-

terisk in the printed dataset and labeled in red in the plot. The second and eighth studies fulfill

the criteria for influential studies.

6.5 R code for removing outlying studies

Once all possible outliers are determined, we can remove them with the following generic code:

#remember to add "add=0" when using double-arcsine transformation

ies.noutlier=escalc(xi=cases, ni=total,

measure="PR"/"PLO"/"PFT",

data=dat[-c(study number 1, study number 2, ...),])


If we were to exclude the second and the eighth studies, we would execute the following code:

ies.logit.noutlier=escalc(xi=cases, ni=total, measure="PLO", data=dat[-c(2, 8),])

pes.logit.noutlier=rma(yi, vi, data=ies.logit.noutlier, method="DL")

pes.noutlier=predict(pes.logit.noutlier, transf=transf.ilogit)

print(pes.noutlier, digits=5)

7 Explaining heterogeneity with moderator analyses

7.1 Overview of moderator analysis

We have determined that heterogeneity does exist in our data and identified a few outlying studies that account for part of the heterogeneity. If substantial heterogeneity remains

after we exclude those outliers, a major way to discover other possible sources of heterogeneity is

through moderator analysis. In fact, a thorough moderator analysis is more informative than a

single estimate of summary effect size when meta-analytic data being examined contain substan-

tial heterogeneity. Similar to primary studies, moderator analyses have a sample of “participants”

(i.e., the studies included in a meta-analysis), one or multiple independent variables (i.e., moderat-

ing variables) and one dependent variable (i.e., effect sizes within each subgroup). To predict the

effect of a hypothesized moderator, we apply a weighted linear regression model in which the ef-

fect sizes (i.e., logit- or double-arcsine-transformed proportions) are regressed onto the moderator

(Card, 2015; Thompson & Higgins, 2002):

ES = β0 + β1 C + e (16)

where C is the moderator, β1 is the regression (slope) coefficient representing the moderator effect, and e is the error term. β0 is more informative when testing categorical variables as potential moderators; in such a situation, it is the mean effect of a reference category (i.e., the category coded as 0 in a

dummy variable). It is not necessary to know the mathematics behind the process, but keeping

the regression model formula in mind will be useful when we interpret the resulting output of

moderator analysis in R.
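For instance, a categorical study characteristic can be dummy coded in R before model fitting. The sketch below is purely illustrative (design.dummy is a hypothetical variable name; the example dataset already contains a comparable dummy variable, studesg, which is used later in this tutorial):

#code "Birth cohort" as the reference category (0) and all other designs as 1
dat$design.dummy=ifelse(dat$studydesign=="Birth cohort", 0, 1)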


7.2 Subgroup analysis

There are two major forms of moderator analyses: subgroup analysis and meta-regression. Sub-

group analysis can be considered as a special form of meta-regression in which only one categor-

ical moderator is examined (Thompson & Higgins, 2002). Generally, subgroup analyses are con-

ducted with a mixed effects model in which the random-effects model is used to combine study

effects within each subgroup and the fixed-effect model is used to test whether the effects across

the subgroups vary significantly from each other (Cuijpers, 2016). Subgroup analysis uses an analog

to the analysis of variance (ANOVA) to evaluate the impact of moderators on effect sizes (Lit-

tell et al., 2008). Under the framework of subgroup analysis, the total set of studies is split into

two or more subgroups according to the categories within a categorical moderator and we com-

pare the effect in one subgroup of studies versus that in the rest of the subgroup(s) of studies. In

essence, categorical moderators are study characteristics that can account for a certain proportion

of between-study variability (Hamza et al., 2008). For instance, if a subgroup of studies shares a

common characteristic that the other subgroup(s) does not (e.g., being exposed to a new treat-

ment vs. old treatment) and the effect sizes across the subgroups differ significantly, it is quite

possible that part of the variation can be attributed to this particular study characteristic. In sub-

group analysis, the summary effect size for each group as well as the within-group heterogeneity

are obtained. Moreover, a Wald-type test is conducted to compare the summary effect sizes across

subgroups: using either a Z -score or a Q-statistic (both yield the same p-value), whether or not

two groups have significantly different outcomes can be determined (Viechtbauer, 2010).

7.3 An important caveat regarding obtaining an overall summary effect


size in the presence of subgroups

Within each subgroup of studies, the summary effect size can be calculated under fixed-effect and

random-effects models. One thing to be noted here is the computation of the between-study vari-

ance, τ 2 (Borenstein et al., 2005). As is the case for a fixed-effect meta-analysis in the absence of

moderators, the between-study variance within each subgroup is assumed to be zero. By contrast,

when we compute the summary effect size for each subgroup under the random-effects model, the

value of τ 2 needs to be estimated within subgroups of studies rather than across all studies as we

want to compare the summary effect size and within-group variance of each subgroup. Once we

have obtained the estimated values of τ 2 yielded within the subgroups, we can choose either to

pool them or not, depending on whether we anticipate the true between-study variation in ef-


fect sizes within each subgroup to be the same. If we believe that the observed within-group τ 2

estimates differ from one subgroup to the next due to sampling error alone, then we antici-

pate a common τ 2 across subgroups, and thus we should apply the estimate of the pooled τ 2 to

all the studies. On the other hand, if we believe that apart from sampling errors, some system-

atic causes are also responsible for the differential values of the observed within-group τ 2 , then

we apply separate estimates of τ 2 for each subgroup. Simply put, if we use a separate estimate of

between-study variance, that means we are effectively conducting an independent meta-analysis within each subgroup. When we assume that τ 2 is the same for all subgroups un-

der the random-effects model, we can use the R2 index to represent the proportion of the between-

study variance across all studies that can be explained by a moderator. In the presence of sub-

groups, the estimate of the summary proportion for all studies will be different than that in the

absence of subgroups (Borenstein et al., 2005). This is because we use different estimates for τ 2 in

different cases. In the absence of subgroups, τ 2 is computed based on the dispersion of all studies

from the grand mean (Borenstein et al., 2005). In the presence of subgroups, as we have just discussed above, we have two methods to compute τ 2 . Therefore, there will be three different estimates of the summary effect size, depending on which method is used. Computing the different kinds

of summary proportion estimates depends on the goal of one’s study as well as the nature of the

meta-analytic data, but any differences among these estimates will usually be trivial. Given space

constraints, this issue will not be discussed any further here, but see the code below to learn how

different variants of estimates for summary proportion are derived in R.

7.4 Meta-regression

Just as subgroup analysis relies on an adaptation of ANOVA, the logic of evaluating moderators

in meta-regression parallels the use of regression or multiple regression in primary studies (Card,

2015). Meta-regression examines the relationship between covariates (i.e., moderators) and the ef-

fect sizes in a set of studies using the study as the unit of analysis (Borenstein et al., 2005; Lit-

tell et al., 2008), under the framework of which moderating variables can either be categorical

or continuous. That is, we can incorporate a single continuous variable (e.g., the average age of

participants, sample size, publication year, number of therapy sessions in treatment, etc.), a se-

ries of continuous variables, or a combination of continuous and categorical variables as mod-

erators. When categorical moderators are included in a meta-regression model, they should be

coded as “dummy” variables. In a multivariate model of meta-regression, due to potential multi-


collinearity resulting from interrelated moderators, Hox (2010) suggests that moderating variables

should be evaluated separately in univariate models prior to being tested simultaneously in a sin-

gle model. As is true in the random-effects subgroup analysis, the R2 index can be employed in

meta-regression to indicate the proportion of true heterogeneity across all studies that can be ac-

counted for by one or a set of moderators in order to quantify the magnitude of their impact on

study effects.
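In metafor, this index is printed with the output of any rma() model that contains moderators and can also be read off the fitted object directly. A minimal sketch, assuming a fitted mixed-effects model named subganal.moderator as in the code further below:

subganal.moderator$R2 #percentage of true heterogeneity accounted for by the moderator(s)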

7.5 Visualizing moderator analysis: Scatter plots

We can visualize moderator analysis by producing a scatter plot of the moderating variable(s) and

the effect sizes. A scatter plot is constructed with a center line known as the regression line and

two curved lines showing a 95% confidence interval around it, with each study represented by a circle drawn proportional to its weight (i.e., larger studies are shown as larger circles). What is

important in a scatter plot is the slope of the regression line. A completely horizontal regression

line implies that there is no association between the moderator and the effect sizes. If, however,

the regression line is not horizontal, this indicates that the effect sizes vary as the value of the

moderator changes. The slope coefficient and its significance test can inform us if the slope signif-

icantly deviates from zero. A significantly positive or negative slope suggests that the explanatory

variable has a significant moderating effect and can explain a significant amount of heterogeneity.

7.6 An important caveat: Results of moderator analyses cannot be seen


as causal evidence

Moderator analysis has several limitations. A main one is that both subgroup analysis and

meta-regression require a large ratio of studies to moderators. In general, moderator analysis should

not be attempted unless at least 10 studies are available for each moderator in the analysis, espe-

cially in a multivariate model where the number of studies could be small, leading to reduced sta-

tistical power (Higgins & Green, 2006; Littell et al., 2008). Perhaps the most important limitation

is that the significant differences found between subgroups of studies in moderator analyses cannot

be seen as causal evidence. It is quite possible that unidentified factors that are not measured in

such moderator analyses are responsible for the differential effect sizes across the subgroups. Un-

fortunately, there is no solution for this problem (Cuijpers, 2016; Littell et al., 2008). Hence, one

cannot draw causal conclusions from moderator analyses. In light of this, we strongly suggest that

authors choose moderating variables based on theoretical reasoning and only test those for which


a strong theoretical case can be made in order to avoid erroneously attributing heterogeneity to

spurious moderators found by chance (Schmidt & Hunter, 2014).

7.7 R code for calculating subgroup summary proportions, conducting subgroup analysis, and recalculating the overall summary proportion

Based on the analysis above, we have developed the following generic code to help readers perform

subgroup analyses and compute (overall and within-group) summary proportions appropriately. In

order to select the appropriate computational option, readers need to develop a good understand-

ing of the nature of the data they are working with.

In the first situation, we do not assume a common between-study variance component across sub-

groups and thus do not pool within-group estimates of τ 2 . To allow us to examine the moderating

effect of a potential moderator variable with a mixed-effect model and to recalculate a new over-

all summary effect size using a separate τ 2 within each subgroup, we first fit a separate random-effects model within each subgroup and then combine the estimated statistics from each model

into a data frame. Finally, we fit a fixed-effect model to compare the two estimated logit trans-

formed proportions and calculate a new summary effect size. The generic code is provided here:

Assumption 1: Do not assume a common between-study variance component (do not pool

within-group estimates of between-study variance)

Option 1: no transformation

pes.subgroup1=rma(yi, vi, data=ies, subset=moderator=="subgroup1")

pes.subgroup2=rma(yi, vi, data=ies, subset=moderator=="subgroup2")

dat.diffvar=data.frame(estimate=c(pes.subgroup1$b, pes.subgroup2$b),

stderror=c(pes.subgroup1$se, pes.subgroup2$se),

moderator=c("subgroup1", "subgroup2"),

tau2=round(c(pes.subgroup1$tau2, pes.subgroup2$tau2), 3))

subganal.moderator=rma(estimate, sei=stderror, mods=~moderator,

method="FE", data=dat.diffvar)

pes.moderator=rma(estimate, sei=stderror, method="FE", data=dat.diffvar)

pes.moderator=predict(pes.moderator)

print(pes.subgroup1) #display subgroup 1 summary effect size

print(pes.subgroup2) #display subgroup 2 summary effect size

print(subganal.moderator) #display subgroup analysis results

print(pes.moderator) #display recomputed summary effect size


Option 2: the logit transformation

pes.logit.subgroup1=rma(yi, vi, data=ies.logit, subset=moderator=="subgroup1")

pes.logit.subgroup2=rma(yi, vi, data=ies.logit, subset=moderator=="subgroup2")

pes.subgroup1=predict(pes.logit.subgroup1, transf=transf.ilogit)

pes.subgroup2=predict(pes.logit.subgroup2, transf=transf.ilogit)

dat.diffvar=data.frame(estimate=c(pes.logit.subgroup1$b, pes.logit.subgroup2$b),

stderror=c(pes.logit.subgroup1$se, pes.logit.subgroup2$se),

moderator=c("subgroup1", "subgroup2"),

tau2=round(c(pes.logit.subgroup1$tau2,

pes.logit.subgroup2$tau2), 3))

subganal.moderator=rma(estimate, sei=stderror, mods=~moderator,

method="FE", data=dat.diffvar)

pes.logit.moderator=rma(estimate, sei=stderror, method="FE", data=dat.diffvar)

pes.moderator=predict(pes.logit.moderator, transf=transf.ilogit)

print(pes.subgroup1); print(pes.logit.subgroup1) #display subgroup 1 summary effect size

print(pes.subgroup2); print(pes.logit.subgroup2) #display subgroup 2 summary effect size

print(subganal.moderator) #display subgroup analysis results

print(pes.moderator) #display recomputed summary effect size

Option 3: the double-arcsine transformation

pes.da.subgroup1=rma(yi,vi,data=ies.da, subset=moderator=="subgroup1")

pes.da.subgroup2=rma(yi,vi,data=ies.da, subset=moderator=="subgroup2")

pes.subgroup1=predict(pes.da.subgroup1, transf=transf.ipft.hm,targ=list(ni=dat$total))

pes.subgroup2=predict(pes.da.subgroup2, transf=transf.ipft.hm,targ=list(ni=dat$total))

dat.diffvar=data.frame(estimate=c(pes.da.subgroup1$b, pes.da.subgroup2$b),

stderror=c(pes.da.subgroup1$se, pes.da.subgroup2$se),

moderator=c("subgroup1", "subgroup2"),

tau2=round(c(pes.da.subgroup1$tau2,

pes.da.subgroup2$tau2), 3))

subganal.moderator=rma(estimate, sei=stderror, mods=~moderator,

method="FE", data=dat.diffvar)

pes.da.moderator=rma(estimate, sei=stderror, method="FE", data=dat.diffvar)

pes.moderator=predict(pes.da.moderator, transf=transf.ipft.hm, targ=list(ni=dat$total))

print(pes.subgroup1); print(pes.da.subgroup1) #display subgroup 1 summary effect size


print(pes.subgroup2); print(pes.da.subgroup2) #display subgroup 2 summary effect size

print(subganal.moderator) #display subgroup analysis results

print(pes.moderator) #display recomputed summary effect size

In the second situation, we assume a common between-study variance component across sub-

groups and pool within-group estimates of τ 2 . In this case, we can directly use the rma() com-

mand to fit a mixed-effect model to evaluate the moderating effect of a potential predictor. How-

ever, to allow us to calculate a new overall summary proportion using a pooled τ 2 across all stud-

ies, we still have to combine a new data frame containing statistics estimated in two random-

effects models. Once we have created the new data frame, we can calculate a new overall summary

effect based on the data frame using a fixed-effect model or a random-effects model (based on the

various factors we have discussed above, e.g., the conclusion one wishes to make, the true distribu-

tion of effect sizes, etc.).

Assumption 2: Assume a common between-study variance component (pool within-group

estimates of between-study variance)

Option 1: no transformation

subganal.moderator=rma(yi, vi, data=ies, mods=~moderator)

pes.subgroup1=rma(yi, vi, data=ies, mods=~moderator=="subgroup2")

pes.subgroup2=rma(yi, vi, data=ies, mods=~moderator=="subgroup1")

pes.subg.moderator=predict(subganal.moderator)

dat.samevar=data.frame(estimate=c((pes.subgroup1$b)[1], (pes.subgroup2$b)[1]),
stderror=c((pes.subgroup1$se)[1], (pes.subgroup2$se)[1]),

tau2=subganal.moderator$tau2)

pes.moderator=rma(estimate, sei=stderror,

method="FE or other RE estimators",

data=dat.samevar)

pes.moderator=predict(pes.moderator)

print(pes.subg.moderator[study label 1]) #display subgroup 1 summary effect size

print(pes.subg.moderator[study label 2]) #display subgroup 2 summary effect size

print(subganal.moderator) #display subgroup analysis results

print(pes.moderator) #display recomputed summary effect size

Option 2: the logit transformation

subganal.moderator=rma(yi, vi, data=ies.logit, mods=~moderator)


pes.logit.subgroup1=rma(yi, vi, data=ies.logit, mods=~moderator=="subgroup2")

pes.logit.subgroup2=rma(yi, vi, data=ies.logit, mods=~moderator=="subgroup1")

pes.subg.moderator=predict(subganal.moderator, transf=transf.ilogit)

dat.samevar=data.frame(estimate=

c((pes.logit.subgroup1$b)[1],(pes.logit.subgroup2$b)[1]),

stderror=

c((pes.logit.subgroup1$se)[1],(pes.logit.subgroup2$se)[1]),

tau2=subganal.moderator$tau2)

pes.logit.moderator=rma(estimate, sei=stderror,

method="FE or other RE estimators",

data=dat.samevar)

pes.moderator=predict(pes.logit.moderator, transf=transf.ilogit)

print(pes.subg.moderator[study label 1]) #display subgroup 1 summary effect size
print(pes.subg.moderator[study label 2]) #display subgroup 2 summary effect size

print(subganal.moderator) #display subgroup analysis results

print(pes.moderator) #display recomputed summary effect size

Option 3: the double-arcsine transformation

subganal.moderator=rma(yi, vi, data=ies.da, mods=~moderator)

pes.da.subgroup1=rma(yi, vi, data=ies.da, mods=~moderator=="subgroup2")

pes.da.subgroup2=rma(yi, vi, data=ies.da, mods=~moderator=="subgroup1")

pes.subg.moderator=predict(subganal.moderator,

transf=transf.ipft.hm,

targ=list(ni=dat$total))

dat.samevar=data.frame(estimate=c((pes.da.subgroup1$b)[1], (pes.da.subgroup2$b)[1]),

stderror=c((pes.da.subgroup1$se)[1], (pes.da.subgroup2$se)[1]),

tau2=subganal.moderator$tau2)

pes.da.moderator=rma(estimate, sei=stderror,

method="FE or other RE estimators",

data=dat.samevar)

pes.moderator=predict(pes.da.moderator, transf=transf.ipft.hm, targ=list(ni=dat$total))

print(pes.subg.moderator[study label 1]) #display subgroup 1 summary effect size
print(pes.subg.moderator[study label 2]) #display subgroup 2 summary effect size

print(subganal.moderator) #display subgroup analysis results

print(pes.moderator) #display recomputed summary effect size


To help readers better understand how to use the code, we will now illustrate its implementa-

tion with the running example. For demonstrative purposes, we will use the variable study design

(birth cohort vs. others) as the moderator and conduct the analysis with the logit transformation

under both situations and then compare the resulting estimates of the overall summary propor-

tions with that of the original one.

In the first situation, execute the following code:

pes.logit.birthcohort=rma(yi, vi, data=ies.logit,

subset=studydesign=="Birth cohort",

method="DL")

pes.logit.others=rma(yi, vi, data=ies.logit,

subset=studydesign=="Others",

method="DL")

pes.birthcohort=predict(pes.logit.birthcohort, transf=transf.ilogit, digits=5)

pes.others=predict(pes.logit.others, transf=transf.ilogit, digits=5)

dat.diffvar=data.frame(estimate=c(pes.logit.birthcohort$b, pes.logit.others$b),

stderror=c(pes.logit.birthcohort$se, pes.logit.others$se),

studydesign=c("Birth cohort", "Others"),

tau2=round(c(pes.logit.birthcohort$tau2,

pes.logit.others$tau2), 3))

subganal.studydesign=rma(estimate, sei=stderror, data=dat.diffvar,

mods=~studydesign, method="FE")

pes.logit.studydesign=rma(estimate, sei=stderror, method="FE", data=dat.diffvar)

pes.studydesign=predict(pes.logit.studydesign, transf=transf.ilogit)

print(pes.birthcohort, digits=6); print(pes.logit.birthcohort, digits=3)

print(pes.others, digits=6); print(pes.logit.others, digits=3)

print(subganal.studydesign, digits=3)

print(pes.studydesign, digits=6)

which outputs:

From the output above, we can see that the summary effect estimates are 0.00035 (95% CI=0.00016,

0.00078), 0.00047 (95% CI=0.00034, 0.00065), and 0.00045 (95% CI=0.00034, 0.00061) for the

two subgroups and the overall group of studies, respectively. When we fit separate random-effects

models in the two subgroups, we decide to allow the amount of variance within each set of stud-

ies to be different, which results in two different within-group estimates of τ 2 (0.93 and 0.25 for


studies using the birth cohort design and other study designs, respectively). That means studies

within each subgroup share the same estimate of τ 2 . The results of the test of moderators reveal

that the difference between the two subgroup summary estimates is not significant (QM (1)=0.45,

p=0.51) despite the fact that the estimate of the second subgroup is larger than the first. Note

that the residual heterogeneity in the fixed-effect model comparing the subgroup estimates is equal to QE (0)=0, p=1. This is because the within-group heterogeneity has been accounted for

in each subgroup (Q(df =5)=344.594, p<0.001; Q(df =10)=235.944, p<0.01, respectively) in the

random-effects model, thus there is no heterogeneity left to be accounted for (which is also the

definition of residual heterogeneity).

In the second situation, execute the following code:

subganal.studydesign=rma(yi, vi, data=ies.logit, mods=~studydesign, method="DL")

pes.logit.birthcohort=rma(yi, vi, data=ies.logit, mods=~studydesign=="Others",

method="DL")


pes.logit.others=rma(yi, vi, data=ies.logit, mods=~studydesign=="Birth cohort",

method="DL")

pes.subg.studydesign=predict(subganal.studydesign, transf=transf.ilogit)

dat.samevar=data.frame(estimate=

c((pes.logit.birthcohort$b)[1],(pes.logit.others$b)[1]),

stderror=

c((pes.logit.birthcohort$se)[1], (pes.logit.others$se)[1]),

tau2=subganal.studydesign$tau2)

pes.logit.studydesign=rma(estimate, sei=stderror, method="DL", data=dat.samevar)

pes.studydesign=predict(pes.logit.studydesign, transf=transf.ilogit)

print(subganal.studydesign, digits=4)

print(pes.subg.studydesign[1], digits=6)

print(pes.subg.studydesign[17], digits=6)

print(pes.studydesign, digits=6)

which outputs:

This output is fairly self-explanatory. From it, we can see that we have fitted a mixed-

effects model, meaning a random-effects model is used to combine studies within each subgroup

and a fixed-effect model is used to combine subgroups and produce the estimate for the summary

effect size. The amount of within-group heterogeneity across the two subgroups is assumed to be

the same (τ 2 =0.042 in this case). It is the combined estimate yielded by pooling the estimates


of the two within-group variances displayed earlier (τ 2 =0.93 and τ 2 =0.25). Once we have the

pooled estimate, we then apply it to each study across the two subgroups, meaning every study

now shares the same estimate of τ 2 (i.e., 0.042). From the test of moderators section, we can see that the moderator study design does not have a moderating effect (QM (1)=0.92, p=0.34).

In other words, it cannot explain the true heterogeneity in the effect sizes. That is, when we di-

vide the included studies according to their study design, we fail to find a significant difference

between the two subgroups of effect sizes. This conclusion can also be supported by the result of

the test for residual heterogeneity: there is significant unexplained heterogeneity left between all

effect sizes in the data (QE (15)=580.54, p<0.01) after the study design has been added to the

mixed-effects model to examine its potential moderating effect, which can also explain why R2

shows 0%, suggesting that the study design can explain 0% of the between-study variance, namely

the true heterogeneity. It is worth noting that under the framework of the mixed-effect model,

the residual heterogeneity estimate here (QE (15)=580.54) is the sum of the two within-group

heterogeneity estimates we have obtained above in the random-effects model (Q(df =5)=344.59,

Q(df =10)=235.94, respectively). Finally, the estimates for the two subgroup summary propor-

tions and the overall summary proportion are displayed at the bottom of the output. They are

0.00034 (95% CI=0.0002, 0.00061), 0.00049 (95% CI=0.00032, 0.00074), and 0.00043 (95% CI=0.00031,

0.0006), respectively. There are several other points that are worth noting. When we code dummy

variables, the subset of studies coded as 0 in a dummy variable will function as the reference group

(represented by the intercept of the fitted mixed-effects regression model). The other subset of

studies coded as 1 will be compared against the reference group (in the running example, “birth

cohort” is the reference group). In fact, it makes no difference which subset is selected as the ref-

erence group from a statistical point of view. The estimate of the intercept (i.e., -7.97) is equal

to the logit-transformed summary effect estimate of the studies in the reference group (i.e., birth

cohort). The regression coefficient (i.e., the slope) is 0.35.

We can see that even though the conclusion drawn from the two computational models can be the

same (i.e., study design is not a significant moderating variable), when we calculate the summary

effect estimate for the overall set of studies, the estimates vary according to the τ 2 that is applied

in different situations (in the presence of subgroups vs. in the absence of subgroups). In general, if

the number of studies in each subgroup is small, it is recommended to pool the separate τ 2 . In so doing

we can obtain a more accurate estimate of τ 2 . In contrast, if we decide not to pool them, we need

at least 5 studies in each subgroup to be able to yield a moderately stable estimate of τ 2 within

each subgroup (Borenstein et al., 2005).


7.8 R code for creating forest plots in the presence of subgroups

We have shown readers how to create forest plots without subgroups. We will now begin con-

structing forest plots with subgroups under different assumptions (a common between-study vari-

ance component vs. separate between-study variance components). Constructing forest plots using

metafor could be challenging even for experienced metafor users. Fortunately, we have obtained

the estimates for subgroup and overall summary proportions in the previous section, which are

needed to create our forest plot. We simply need to copy those numbers and paste them into the

forest plot being constructed. The following is the generic code to construct forest plots under the

first assumption (which also corresponds to the first assumption in the previous section):

Assumption 1: Do not assume common between-study variance component (use separate

within-group estimates of between-study variance).

Option 1: no transformation

ies.summary=summary(ies, ni=dat$total)

forest(ies.summary$yi, ci.lb=ies.summary$ci.lb, ci.ub=ies.summary$ci.ub,

rows=c(d:c, b:a))

Option 2: the logit transformation

ies.summary=summary(ies.logit, transf=transf.ilogit, ni=dat$total)

forest(ies.summary$yi, ci.lb=ies.summary$ci.lb, ci.ub=ies.summary$ci.ub,

rows=c(d:c, b:a))

Option 3: the double arcsine transformation

ies.summary=summary(ies.da, transf=transf.ipft, ni=dat$total)

forest(ies.summary$yi, ci.lb=ies.summary$ci.lb, ci.ub=ies.summary$ci.ub,

rows=c(d:c, b:a))

The code above merely builds the “bones” of a forest plot. More components need to be added to

the plot (e.g., text, headers, labels, etc.). We also have to manually adjust the appearance of the

plot to make it look prettier and more professional. Dividing a set of included studies into sev-

eral subgroups in a forest plot using metafor has to be done manually with the rows= argument. Readers may have noticed that the parameters (a, b, c, and d, each denoting a particular position on the y-axis) in the argument are ordered from right to left. a specifies the vertical posi-

tion for plotting the first study in the first subgroup; b specifies the vertical position for plotting


the last study in the first subgroup; c specifies the vertical position for plotting the first study in

the second subgroup; d specifies the vertical position for plotting the last study in the second sub-

group. Mathematically speaking, b − a + 1 and d − c + 1 should be equal to the number of studies in

their corresponding subgroups. c and b do not need to be consecutive numbers. If we order these

parameters from left to right, studies will be displayed in reverse order with the first study being

displayed at the bottom of the plot and the last study being displayed at the top of all the stud-

ies.

To illustrate, we can execute the following code to create a forest plot using study design as the moderator.

ies.summary=summary(ies.logit, transf=transf.ilogit, ni=dat$total)

par(cex=1, font=6)

forest(ies.summary$yi,

ci.lb=ies.summary$ci.lb, ci.ub=ies.summary$ci.ub,

ylim=c(-5, 23),

xlim=c(-0.005, 0.005),

slab=paste(dat$author, dat$year, sep=","),

ilab=cbind(dat$cases, dat$total),

ilab.xpos=c(-0.0019, -0.0005),

ilab.pos=2,

rows=c(19:14, 8.5:-1.5),

at=c(seq(from=0, to=0.004, by=0.001)),

refline=pes.studydesign$pred,

main="",

xlab="Proportion(%)",

digits=4)

par(cex=1.2, font=7)

addpoly(pes.birthcohort$pred, ci.lb=pes.birthcohort$ci.lb, ci.ub=

pes.birthcohort$ci.ub, row=12.8, digits=5)

addpoly(pes.others$pred, ci.lb=pes.others$ci.lb, ci.ub=pes.others$ci.ub, row=-2.7,

digits=5)

addpoly(pes.studydesign$pred, ci.lb=pes.studydesign$ci.lb, ci.ub=

pes.studydesign$ci.ub, row=-4.6, digits=5)

par(cex=1.1, font=7)

text(-0.005, 21.8, pos=4, "Study")


text(c(-0.0026, -0.0014), 21.8, pos=4, c("Cases", "Total"))

text(0.0025, 21.8, pos=4, "Proportion [95% CI]")

text(-0.005, c(9.7, 20.2), pos=4, c("Others", "Birth cohort"))

par(cex=1, font=7)

text(-0.005, -4.6, pos=4, c("Overall"))

text(-0.005, 12.8, pos=4, c("Subgroup"))

text(-0.005, -2.7, pos=4, c("Subgroup"))

abline(h=-3.7)

which produces:

Notice that the overall summary proportion is 0.00045 (95% CI=0.00033, 0.00061) under the given

assumption, which is different from the one derived in the absence of subgroups (0.00042). As we

have mentioned before, it would be challenging and time-consuming to create a forest plot using

metafor. Recall that we recommend readers calculate summary proportions under the second as-

sumption so that the estimate of τ 2 can be more accurate, especially when a given meta-analysis

contains fewer than 5 studies in each subgroup. Fortunately, when we work under the second assumption we can use the meta package, whose syntax for creating graphs


is highly user-friendly and very easy to learn, to construct forest plots.

Assumption 2: Assume common between-study variance component (pool within-group

estimates of between-study variance).

subganal.moderator=rma(yi, vi, data=ies/ies.logit/ies.da, mods=~moderator, method="DL")

pes.summary=metaprop(cases, total, authoryear, data=dat, sm="PRAW"/"PLO"/"PFT",

byvar=moderator,

tau.common=TRUE,

tau.preset=sqrt(subganal.moderator$tau2))

forest(pes.summary)

Using the combination of the argument mods=~moderator in the rma() function and the argu-

ments byvar=moderator, tau.common=TRUE, and tau.preset=sqrt(subganal.moderator$tau2) in

the metaprop() function allows us to achieve our goal here. To construct a forest plot for the run-

ning example using study design as moderator, use the following code:

subganal.studydesign=rma(yi, vi, data=ies.logit, mods=~studydesign, method="DL")

pes.summary=metaprop(cases, total, authoryear, data=dat,

sm="PLO"

method.tau="DL",

method.ci="NAsm",

byvar=studydesign,

tau.common=TRUE,

tau.preset=sqrt(subganal.studydesign$tau2))

forest(pes.summary,

xlim=c(0,4),

pscale=1000,

rightcols=FALSE,

leftcols=c("studlab", "effect", "ci"),

leftlabs=c("Study", "Proportion", "95% C.I."),

text.random="Combined prevalence",

xlab="Prevalence of CC( )", smlab="",

weight.study="random", squaresize=0.5, col.square="navy",

col.diamond="maroon", col.diamond.lines="maroon",

pooled.totals=FALSE,

comb.fixed=FALSE,


fs.hetstat=10,

print.tau2=TRUE,

print.Q=TRUE,

print.pval.Q=TRUE,

print.I2=TRUE,

digits=2)

which returns the following forest plot:

Notice that the estimates of τ 2 are now the same across subgroups (0.4427) and that the overall summary proportion and

its 95% CI have also changed (0.43; 95% CI=0.31, 0.6). Again, unless you have a large number of

studies in each subgroup or you have a solid reason to believe that the within-group heterogeneity

varies greatly across subgroups, calculating summary proportions and constructing forest plots

with a pooled τ 2 would suffice in most cases.

7.9 R code for conducting meta-regression

In cases where we want to evaluate the effect of a continuous moderator, the R code we would use

is identical to what we would use in a subgroup analysis. This can be achieved with the following

generic code:


metareg.moderator=rma(yi, vi, data=ies/ies.logit/ies.da, mods=~moderator)

As mentioned above, the effect sizes can be regressed on a combination of significant moderators in a single model. This can be achieved by adding the plus sign in the mods= argument. The generic

code is as follows:

metareg.moderators=rma(yi, vi, data=ies/ies.logit/ies.da,

mods=~moderatorA+moderatorB+moderatorC+...+moderatorZ)
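For instance, with the running example one could, purely for illustration, test the moderators size and year (both of which appear later in this tutorial) jointly in a single model:

metareg.size.year=rma(yi, vi, data=ies.logit, mods=~size+year, method="DL")
print(metareg.size.year)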

As a final note, metafor and meta handle proportions equal to 0 or 1 differently. Metafor applies

the 0.5 adjustment for calculating the proportions and the sampling variances whereas meta does

not adjust the counts for computing the proportions themselves, but it does the usual 0.5 adjust-

ment for computing the sampling variances. This different handling of proportions and variances

can lead to small discrepancies between the results, but they are usually negligible.
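If finer control is desired, the adjustment in metafor can be set explicitly via the add= and to= arguments of escalc(). A minimal sketch (add=0.5 combined with to="only0" applies the adjustment only to studies with zero cases or with cases equal to the total):

ies.logit=escalc(xi=cases, ni=total, measure="PLO", data=dat, add=0.5, to="only0")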

7.10 R code for visualizing moderator analyses

The code to create scatter plots is partly different depending on which transformation option is

selected. Note that if we want to visualize subgroup analyses, we need to use dummy variables

to create scatter plots (e.g., variables labeled studesg in the running example). Using categorical

variables (e.g., studydesign in the example dataset), however, will create box plots.

Option 1: no transformation

wi=1/sqrt(ies$vi)

size=1+3*(wi-min(wi))/(max(wi)-min(wi))

plot(ies$dummyvar, ies$yi, cex=size)

Option 2: the logit transformation

wi=1/sqrt(ies.logit$vi)

size=1+3*(wi-min(wi))/(max(wi)-min(wi))

plot(ies.logit$dummyvar, transf.ilogit(ies.logit$yi), cex=size)

plot(ies.logit$dummyvar, ies.logit$yi, cex=size) #y-axis unit: logit-transformed value

Option 3: the double-arcsine transformation

wi=1/sqrt(ies.da$vi)

size=1+3*(wi-min(wi))/(max(wi)-min(wi))


plot(ies.da$dummyvar, transf.ipft.hm(ies.da$yi, targ=list(ni=dat$total)), cex=size)

plot(ies.da$dummyvar, ies.da$yi, cex=size) #y-axis unit: double-arcsine-transformed value

Using the running example, we can create scatter plots with regression lines and corresponding

95% CI bounds for study design with:

subganal.studesg=rma(yi, vi, data=ies.logit, mods=~studesg, method="DL")

wi=1/sqrt(ies.logit$vi)

size=1+3*(wi-min(wi))/(max(wi)-min(wi))

plot(ies.logit$studesg, transf.ilogit(ies.logit$yi), cex=size, xlab="Study Design",

ylab="Proportion")

pes.studesg=predict(subganal.studesg, transf=transf.ilogit, newmods=c(0:2))

lines(0:2, pes.studesg$pred, col="navy")

lines(0:2, pes.studesg$ci.lb, lty="dashed", col="maroon")

lines(0:2, pes.studesg$ci.ub, lty="dashed", col="maroon")

which generates the scatter plot as shown below:

A visual inspection of the scatter plot shows that the slope of the estimated

regression line is neither completely horizontal nor very steep, suggesting a weak association be-

tween study design and the observed effects. In addition, nearly half of the studies fall outside of

the 95% CI bounds, implying that there might be one or more very important missing factors that

could better account for the heterogeneity in the effect sizes. But we are not certain as to whether

this relationship is significant unless we examine the output for the model. To generate the out-

put, run the following code:


print(subganal.studesg)

which outputs:

From this output, we can conclude that study design is not a significant moderator (QM (1)=0.92,

p=0.34), which is also supported by the insignificant regression coefficient (0.35; Z (15)=0.96,

p=0.34).

To create a scatter plot for sample size, execute:

subganal.size=rma(yi, vi, data=ies.logit, mods=~size, method="DL")

pes.size=predict(subganal.size, newmods=c(0:2), transf=transf.ilogit)

wi=1/sqrt(ies.logit$vi)

size=1+3*(wi-min(wi))/(max(wi)-min(wi))

plot(ies.logit$size, transf.ilogit(ies.logit$yi), cex=size, xlab="Sample size",

ylab="Proportion")

lines(0:2, pes.size$pred, col="navy")

lines(0:2, pes.size$ci.lb, lty="dashed", col="maroon")

lines(0:2, pes.size$ci.ub, lty="dashed", col="maroon")

which generates the scatter plot as presented below:

In this case, the slope of the estimated regression line is much steeper. A visual inspection of this

scatter plot informs us that sample size appears to be negatively correlated with the observed pro-

portions. When the sample size is less than 100,000, the proportion is higher whereas when the

sample size is larger than 100,000, the proportion is lower. Again, missing factors result in a cer-

tain amount of omitted variable bias here. This is also confirmed by the mixed-effects model re-


sults:

Together, the results of the test of moderators (QM (1)=36.43, p<0.0001) as well as the significant

slope coefficient (−1.29; Z (15)=−6.04, p<0.0001) conform to our visual interpretation of the asso-

ciation between sample size and the observed effect sizes. In stark contrast with study design, the

R2 for sample size shows that 57.07% of the true heterogeneity in the observed effect size can be

accounted for by sample size.

The running example does not examine any continuous predictors. However, for illustrative pur-

poses, we can plot the observed effect sizes of the individual studies against the continuous vari-

able publication year based on a mixed-effects model:

metareg.year=rma(yi, vi, data=ies.logit, mods=~ year, method="DL")

wi=1/sqrt(ies.logit$vi)


size=1+3*(wi-min(wi))/(max(wi)-min(wi))

pes.year=predict(metareg.year, newmods=c(1985:2020), transf=transf.ilogit)

plot(ies.logit$year, transf.ilogit(ies.logit$yi), cex=size, pch=1, las=1, xlab="Publication year",

ylab="Proportion")

lines(1985:2020, pes.year$pred, col="navy")

lines(1985:2020, pes.year$ci.lb, lty="dashed", col="maroon")

lines(1985:2020, pes.year$ci.ub, lty="dashed", col="maroon")

ids=c(1:17)

pos=c(1)

text(ies.logit$year[ids], transf.ilogit(ies.logit$yi)[ids], ids, cex=0.9, pos=pos)

which generates:

Note the use of the last three lines of code. They are used to label the studies (represented by the

circles) in the graph.

One might notice that the results presented in this section do not match those of Wu et al. ex-

actly. This is because Wu et al. (2012) calculated the overall and subgroup summary proportions

with the DL estimator, but they switched to the REML estimator to conduct subgroup analyses.

The reason is not stated in their report.


8 The issue of publication bias in meta-analyses of propor-

tions

8.1 Overview of publication bias in the context of meta-analyses of pro-


portions

One of the major threats to the validity of meta-analysis is publication bias. This is the tendency

to submit or accept a study depending on the direction or strength of its results; in other words,

compared with studies with positive and/or significant results, small studies reporting negative re-

sults and/or small effects are less likely to be published and subsequently included in a meta-analysis (Dickersin, 1990; Quintana, 2015). Small studies have to have a more robust effect in order

to be published. Omitting unpublished studies from a review could lead to a biased estimate of

summary effect (Song et al., 2000), because the smaller a study, the larger the effect necessary for

the results to be found statistically significant (Sterne et al., 2000). Moreover, including published

small studies with large effects and not including negative studies at the same time could yield an

overestimation of the true effect (Cuijpers, 2016).

Studies included in meta-analyses of proportions are observational and non-comparative, and thus

do not calculate significance levels for their results. This means that results reported in such studies cannot be interpreted as “positive/negative” or “desirable/undesirable.” Accordingly, sta-

tistical non-significance is unlikely to be an issue that may have biased publications (Maulik et al.,

2011). In practice, authors reporting low proportions (e.g., rare event rates) are just as likely to

have their work published as those reporting very high proportions (e.g., high cure rates). There-

fore, we believe that traditional publication bias modelling tools developed for randomized con-

trolled trials (i.e., funnel plot asymmetry analyses) are less useful in the context of meta-analyses

of observational studies (detailed explanation is provided below). However, authors of meta-analyses

of proportions continue to use these tests and conclude that no publication bias is detected in their

studies when a significant relationship between study size and study effect is not found by the

tests.

8.2 Detecting publication bias through visual inspection of funnel plots


in meta-analyses of randomized controlled trials (RCTs)

The first step of exploring publication bias is to create a visual tool known as a funnel plot (Light

& Pillerner, 1984). Estimates of effect size are plotted on a funnel plot as circles, the distribution


of which represents the association between study effect and study size (Sterne et al., 2000). A

funnel plot is essentially a scatter plot of study effect estimates on the x-axis (in the case of a

meta-analysis of proportions, the logit- or double-arcsine- transformed proportion) against some

measure of study size on the y-axis (usually the standard error). In other words, on a funnel plot,

the location of a circle is plotted according to the sample size or precision (as precision depends

largely on sample size) of the corresponding study. A vertical line is situated at the value of the

(transformed) summary effect on the funnel plot. When a study is larger, it is more precise, and

thus its effect size is more similar to the summary effect size and it has a lower standard error;

when a study is smaller, it has less precision, and thus its effect size differs more from the sum-

mary effect size and it has a larger standard error. As a result, circles representing smaller

studies are broadly spread towards the bottom of the funnel plot and digress further from the cen-

ter line, whereas circles representing larger studies are distributed more narrowly towards the up-

per area of the graph, symmetrically clustered around the vertical line due to random sampling er-

rors (Sterne et al., 2000; Anzures-Cabrera & Higgins, 2010). In general, in the presence of publica-

tion bias, studies with null or undesired results will not be present (Littell et al., 2008), and thus

many circles at the bottom of a funnel plot and a few in the middle tend to be absent, making

the plot asymmetrical (Borenstein et al., 2005; Petrie et al., 2003). The empty area where circles

are not present could appear on either side of the center line, depending on the desired direction

of the effect. A funnel plot also includes two limit lines indicating the 95% CI around the sum-

mary effect size, which can serve to visualize the extent of heterogeneity in true effects (Anzures-

Cabrera & Higgins, 2010): the more circles that lie beyond the two limit lines, the more likely it is

that heterogeneity is high.
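To make this pattern concrete, the following minimal sketch (our illustration, not part of the running example; k, true.p, ni, xi, sim.dat, and sim.pes are hypothetical names) simulates studies of widely varying size around a single true proportion and plots them with metafor's funnel():

library(metafor)
set.seed(123)
k       <- 40                                    # number of simulated studies
true.p  <- 0.25                                  # common true proportion
ni      <- round(runif(k, min=30, max=800))      # widely varying study sizes
xi      <- rbinom(k, size=ni, prob=true.p)       # observed event counts
sim.dat <- escalc(measure="PLO", xi=xi, ni=ni)   # logit-transformed proportions
sim.pes <- rma(yi, vi, data=sim.dat)             # random-effects model
funnel(sim.pes)   # small studies scatter widely at the bottom; large studies
                  # cluster near the summary effect at the top

Because no bias is built into this simulation, the resulting plot should be roughly symmetric, as described above.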

8.3 An important caveat: Funnel plot asymmetry does not equal publication bias

It is crucial for readers to know that the presence of funnel plot asymmetry does not necessar-

ily indicate publication bias (Hunter et al., 2014). Asymmetry may also arise from numerous mechanisms other than publication bias (Egger et al., 1997), such as heterogeneity in true effects as we have mentioned above, English language bias (i.e., English-speaking researchers failing to include small studies that have been published in foreign languages), and citation bias (i.e., researchers failing to locate small studies because they are quoted less frequently in papers), all of which can make the inclusion of small studies


less likely. However, when a small study shows a stronger effect for various reasons (e.g., poor

study methodology), its chance of getting published is increased (Schwarzer et al., 2015). This

tendency, known as the “small-study effect”, can also contribute to funnel plot asymmetry (Sterne

et al., 2000). Visual inspection of funnel plots, however, might not allow one to draw concrete in-

terpretations and may even lead to misleading conclusions when such plots do not show a distinct

pattern suggesting evident publication bias (Card, 2015). In their study, Terrin et al. (2005) found

that participants (medical researchers), judging by eye, misclassified around 50% of funnel plots with different levels of asymmetry as being affected or unaffected by publication bias. This is not surprising because a funnel plot may not look very asymmetrical in the pres-

ence of publication bias, whereas it may not look very symmetrical in the absence of bias. Sterne

et al. (2000) suggested that funnel plots should be employed as a generic tool to identify small-

study effects rather than as a means of diagnosing specific types of bias. In their study, Hunter et al.

(2014) examined the utility of funnel plots in detecting publication bias in meta-analyses of pro-

portions. They concluded that traditionally constructed funnel plots using standard error (SE)

or the inverse of SE on the y-axis may be a potentially misleading method for the detection of

publication bias because these methods are suspected to over-estimate the degree of funnel plot

asymmetry in cases where observed proportions are extreme. They suggested using sample size

as the measure of precision on the y-axis in dealing with extremely low or high proportions. In

summary, the interpretation of a funnel plot has an inherent degree of subjectivity, which often makes it problematic (Vevea & Woods, 2005).

8.4 Detecting publication bias with formal tests: rank correlation test, Egger's regression test, and trim-and-fill

A more objective way to assess funnel plot asymmetry and unearth publication bias is to test the relationship between effect size and its precision (e.g., standard error) statistically. The presence of a strong relationship between the two provides evidence of asymmetry in the funnel plot and suggests the possibility of publication bias, while the absence of a relationship suggests otherwise (Card, 2015; Rendina-Gobioff & Kromrey, 2004; Sterne et al., 2000). Two such tests that are commonly employed are the rank correlation test (Begg & Mazumdar, 1994) and Egger's regression test (Egger et al., 1997). The rank

correlation test examines if observed study effects and their sampling variances are significantly

associated. A significant degree of correlation may suggest publication bias. A major limitation of


this test is its highly variable power. Generally, the test is quite powerful for large meta-analyses

involving more than 75 studies. Nonetheless, the power will become moderate when it is employed

in small meta-analyses with fewer than 25 studies (Rothstein et al., 2006). Thus, a non-significant

test result cannot be taken as evidence of a lack of publication bias when the meta-analysis is

small. Egger’s regression test is a weighted linear regression test in which the standardized ef-

fect estimate (i.e., the quotient of effect size and its standard error) is regressed against precision

(i.e., the inverse of the standard error). In other words, we can assess publication bias by us-

ing precision to predict the standardized effect, with a significant result suggesting the possible

presence of publication bias. This test allows us to better detect publication bias in small meta-

analyses, unless the meta-analysis itself is based on a small number of small studies.
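To clarify the regression just described, the classical form of the test can be written as an ordinary linear model; the sketch below is illustrative only and assumes that yi holds the (transformed) effect sizes and sei their standard errors:

egger.fit <- lm(I(yi/sei) ~ I(1/sei))   # standardized effect regressed on precision
summary(egger.fit)                      # an intercept significantly different from
                                        # zero suggests funnel plot asymmetry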

When publication bias is suspected, the trim-and-fill method (Duval & Tweedie, 2000) can be used to estimate the num-

ber of missing studies that might exist in a meta-analysis and to determine where they might fall

on a funnel plot and visualize them in an attempt to increase the plot’s symmetry. Most impor-

tantly, it is able to add these missing studies to the analysis, recalculate the summary effect size,

and yield the best estimate of the unbiased summary effect size (Borenstein et al., 2005). It can

thus be used as a sensitivity analysis to assess the difference between the trim-and-fill-adjusted

and observed estimates; if the difference is negligible, the validity of the estimated summary effect

size will prove robust (Duval, 2005). However, the trim-and-fill performs poorly when study effects

are heterogeneous (Terrin et al., 2003). In addition, authors of meta-analyses of proportions ought

to be aware that the trim-and-fill approach depends strictly on the assumption that all missing

studies are those with the most negative or undesirable effect sizes. This assumption may be ques-

tionable when trim-and-fill is applied to meta-analyses of proportions with small studies because

it is possible that the effect sizes in such studies are actually large or small for a particular reason

and thus should not be considered “missing.” Put differently, a gap in the left-hand or right-hand

corner of a funnel plot may exist due to a particular reason rather than publication bias. Violat-

ing this assumption can lead to over-correction as pointed out by Vevea & Woods (2005).
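For completeness, the following minimal sketch shows how this sensitivity analysis could be run in metafor, assuming the pes.logit object fitted earlier in this tutorial; the caveats above apply:

tf <- trimfill(pes.logit)            # estimate and impute putatively missing studies
tf                                   # adjusted summary estimate on the logit scale
predict(tf, transf=transf.ilogit)    # adjusted estimate back on the proportion scale
funnel(tf)                           # imputed studies appear as unfilled circles

Comparing the adjusted estimate with the observed one indicates how robust the summary proportion is to the hypothesized missing studies.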

8.5 An important caveat: A significant p-value is not indicative of the presence of publication bias

Current methods for detecting publication bias and gauging its effects are based on the following

assumptions: (a) Large studies are published preferentially regardless of results. (b) Small studies

are unlikely to be published unless they have large effects. (c) Studies with a medium sample size


that have significant results are published. In other words, the smaller a study's sample size, the more likely it is to be affected by publication bias (Borenstein et al., 2005). As we can see,

traditional approaches to modelling publication bias, such as the aforementioned trim-and-fill, the rank correlation test, and Egger's regression model, as well as the more sophisticated weighted selection approaches (e.g., Vevea & Hedges, 1995; Vevea & Woods, 2005), all rest on the assumption that the likelihood of a study getting published depends on its sample size, statistical significance, or the direction of its results (Coburn & Vevea, 2015). Although empirical research has confirmed that statistical significance plays a dominant role in the publication of studies (Preston et al., 2004), significance is not the only criterion. In fact, the underlying publication selection

process across different fields is far more convoluted. Cooper et al. (1997) have demonstrated that

the decision as to whether or not to publish a study is influenced by a variety of criteria or “fil-

ters” set by journal editors and reviewers regardless of methodological quality and significance,

including, but not limited to, the source of funding for research, social preferences related to race

and gender at the time when a research study is conducted, and even research findings that chal-

lenge previously held beliefs. In practice, authors of meta-analyses of proportions have employed

these methods in an attempt to detect publication bias. However, due to the unique nature of

studies included in meta-analyses of proportions (i.e., that they are non-comparative studies), the

p-value may not be a concern in the publication selection process, as a result of which these tradi-

tional methods may not be able to fully explain the asymmetric distribution of effect sizes on fun-

nel plots. Since the traditional methods fail to capture the full complexity of the selection process,

it is also possible that they may fail to identify publication bias in meta-analyses of proportions as

publication bias in non-comparative studies may arise for reasons independent of a lack of signifi-

cance. Hence, any conclusions regarding the presence of publication bias based on these methods

should be drawn with caution.

8.6 R code for creating funnel plots and performing asymmetry tests

As we have discussed earlier, when the goal is estimation of a single summary proportion rather

than a comparison of treatments, interventions, or methods, publication bias is not really pertinent. One can, of course, generate funnel plots and use tests (e.g., Egger's test) to examine

whether the distribution of effect size estimates follows what one would ordinarily anticipate (i.e.,

less variation among larger studies, forming a roughly symmetric dispersion about the

mean) and to detect whether the small-study effect is present. Past that, the exercise is not very


informative.

The generic code to construct funnel plots is as follows:

# X-axis scale: logit- or double-arcsine-transformed values
funnel(pes.logit)   # or funnel(pes.da)

# X-axis scale: proportion
funnel(pes)
funnel(pes.logit, atransf=transf.ilogit)
funnel(pes.da, atransf=transf.ipft.hm, targ=list(ni=dat$total))

To create a funnel plot for the running example, we would execute the following code:

funnel(pes.logit, yaxis="sei")

which produces a funnel plot with the standard error on the y-axis.

There is clear evidence of heterogeneity and funnel plot asymmetry. There is also an indication of

a small-study effect, even though the effect is not very pronounced. If we follow the suggestion of Hunter

et al. (2014) and use sample size as the measure of precision, we can change the argument yaxis=

“sei” to yaxis=“ni” in order to see if the asymmetry is induced by the method of funnel plot con-

struction. The new funnel plot is produced as follows:
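funnel(pes.logit, yaxis="ni")   # sample size, rather than the standard error, on the y-axis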

It is evident that small-study effects truly exist in this meta-analysis. Unfortunately, the large differences between the small and the large studies are not explained in the original

report.

Now we conduct a rank correlation test to examine the relationship between the observed effect sizes and their sampling variances of each study using the following code:


ranktest(pes.logit)   # substitute pes or pes.da as appropriate

Despite the clear evidence of asymmetry shown in the funnel plot, the rank correlation test fails to find a significant relationship between effect size and sampling variance. The reason could be that the

rank correlation test has low power when examining a small number of studies (we have men-

tioned that a meta-analysis of fewer than 25 studies is considered small, and the current meta-analysis

consists of 17 studies).

Egger’s regression test performs better than the rank correlation test when the number of included

studies is small. It is important to note that the traditional weighted Egger’s regression test (Eg-

ger et al., 1997) is no longer advocated by Egger et al. due to a lack of theoretical justification

(Rothstein et al., 2005). Therefore, we will conduct only the unweighted regression test, which is

obtained by executing the following generic code:

regtest(pes.logit, model="rma", predictor="sei")   # substitute pes or pes.da as appropriate

The Egger test confirms that the funnel plot is significantly asymmetrical.


References

I cannot show you the references until the tutorial is published in a scientific journal. If you are

really interested in one or more articles that I cited in this tutorial, feel free to contact me via

email. Thank you for your understanding!


