Meta-Analysis of a Very Low Proportion Through Adjusted Wald Confidence Intervals


Introduction
Meta-analysis, a statistical technique for combining the findings of independent studies, is used in many fields of research and is particularly important in clinical and epidemiological contexts [1]. A meta-analysis of a single effect size provides a more precise estimate of the effect and increases statistical power by combining the results of independent studies. The pooled effect size measure $\hat{\theta}$ can be seen as a linear combination of the effect sizes $\theta_i$, $i = 1, \ldots, k$, of $k \geq 2$ studies, with a weight $w_i$ assigned to each study:
$$\hat{\theta} = \frac{\sum_{i=1}^{k} w_i \hat{\theta}_i}{\sum_{i=1}^{k} w_i}$$
There are two models for performing meta-analysis that we consider in this study: the fixed-effect and the random-effects models. The fixed-effect model is adequate when the effect sizes are homogeneous across studies. According to the inverse variance (IV) method, $w_i = 1/\sigma_i^2$, where $\sigma_i^2$ is the sampling error variance associated with the $i$th study. In the presence of heterogeneity between studies, the random-effects model is usually used, which incorporates the between-study variance $\tau^2$ in addition to the within-study variance.
Under the random-effects model, there are closed-form and iterative methods to estimate the between-study variance; 16 distinct methods can be found in [1]. However, the most popular procedure is the DerSimonian and Laird (DL) method [2], which is the default option in several meta-analysis software packages. Here, we focus on the most popular meta-analysis methods and summarize them by some differentiating features (Table 1). For the random-effects models, the methods described in Table 1 use an estimator $\hat{\tau}^2$ of the between-study variance. The Paule and Mandel (PM) method [3] is similar to the DL method but uses a different estimator of $\tau^2$. The PM approach provides an iterative process to compute $\hat{\tau}^2$; a disadvantage of iterative estimators is that they depend on the choice of maximization method and convergence can fail. There are other alternatives similar to the methods mentioned above; for example, the method proposed by Sidik and Jonkman (SJ) [4] is similar to the Hartung and Knapp (HK) method but uses a different estimator of $\tau^2$.
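The inverse-variance pooling and the two between-study variance estimators named above can be sketched as follows. This is a minimal illustration, not the implementation used in this study: the function names are hypothetical, the DL estimator is written in its usual truncated moment form, and the PM estimator is obtained by bisection on the generalised Q statistic, which is only one of several possible root-finding choices.

```python
def pool_inverse_variance(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate; tau2 = 0 gives the fixed-effect model."""
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total  # pooled estimate and its variance


def generalised_q(effects, variances, tau2=0.0):
    """Q statistic evaluated at a given tau2 (Cochran's Q when tau2 = 0)."""
    pooled, _ = pool_inverse_variance(effects, variances, tau2)
    return sum((y - pooled) ** 2 / (v + tau2) for y, v in zip(effects, variances))


def tau2_dl(effects, variances):
    """Closed-form DerSimonian-Laird moment estimator, truncated at zero."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (generalised_q(effects, variances) - (k - 1)) / c)


def tau2_pm(effects, variances, tol=1e-10, max_iter=200):
    """Paule-Mandel estimator: solve Q(tau2) = k - 1 (bisection is an assumption)."""
    k = len(effects)
    if generalised_q(effects, variances, 0.0) <= k - 1:
        return 0.0  # no excess heterogeneity, estimate truncated at zero
    lo, hi = 0.0, 1.0
    while generalised_q(effects, variances, hi) > k - 1:
        hi *= 2.0  # Q decreases in tau2, so expand until the root is bracketed
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if generalised_q(effects, variances, mid) > k - 1:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical effect sizes and within-study variances, for illustration only.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
variances = [0.03, 0.03, 0.05, 0.01, 0.05]
tau2 = tau2_dl(effects, variances)
print(pool_inverse_variance(effects, variances))        # fixed-effect
print(pool_inverse_variance(effects, variances, tau2))  # random-effects with DL
print(tau2, tau2_pm(effects, variances))
```

When all within-study variances are equal, the two estimators coincide, which gives a quick sanity check for an implementation of this kind.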
The DL estimator of the between-study variance appears to perform adequately. However, several simulation studies have compared distinct meta-analysis procedures, and they reach several specific conclusions. According to [5], the most recent review, the PM method appears to have a more favorable profile than other estimators of the between-study variance. Traditional meta-analysis assumes an approximately normal within-study model, which may not be a good option in the context of rare events. The topic of rare events is widely discussed in the literature. Several approaches have been proposed for the meta-analysis of two treatment groups with rare events (see, for example, Peto's method [6]; approaches based on the Poisson random-effects model [6]; approaches based on the generalized linear mixed model [7]; unweighted methods [8]; and a review of methods for the meta-analysis of the incidence of rare events [9]). In addition, Sweeting et al. [10] show that the continuity correction is not an adequate option in the meta-analysis of rare events.
Here, we focus on the meta-analysis of one proportion. The meta-analysis of proportions is usually carried out via three well-known methods: the classic Wald method (Wald-0) [11], the logit transformation, and the Freeman-Tukey double arcsine transformation [3]. The inverse of the Freeman-Tukey double arcsine transformation is preferred over the logit and classic Wald methods [3], in particular for large or small proportions. The binomial-normal model was proposed in the context of exact within-study likelihood models and was suggested for the meta-analysis of one proportion with rare events [11,12] (described in Table 1 as the GLMM method). In the meta-analysis of one proportion $p$, the Wald method can be seen as a linear combination of $k$ estimated proportions. We think there is room for improvement in the estimation of the pooled proportion for a single treatment group in a rare events context. In this paper, we discuss an adapted application of the adjusted Wald method for a linear combination of proportions to the meta-analysis of a quasi-extreme proportion, $p \leq 0.05$.

Adjusted Wald estimate
Consider $k$ independent binomial studies, $X_1, \ldots, X_k$, and denote by $p_i$ the proportion of successes (the effect size under study) and by $n_i$ the size of the $i$th study. In each study, an adjusted estimate of the effect size $p_i$, $i = 1, \ldots, k$, denoted by $\tilde{p}_i$, is calculated using the parametric family of shrinkage estimators [4,10] (equation 2). Depending on the particular parameter $h_i$ chosen, several different estimators can be considered (Table 2), and the resulting estimator converges asymptotically in distribution to a normal distribution. There are several variants of the Wald method for estimating linear combinations of binomial proportions [1,2,10], in which the approximate confidence interval (CI) for the linear combination $L$ takes, in general, a Wald form, with the adjusted quantities $\tilde{p}_i$ and $\tilde{q}_i$ used in the estimation of the weight $w_i$. There are therefore several approximate adjusted Wald CIs for the meta-analysis of one proportion. The approach developed here uses a parametric family of shrinkage estimators to estimate the proportions.
Two popular statistical models, the fixed-effect model and the random-effects model, are used and discussed in this paper. Under the fixed-effect model, we assume that all studies in the analysis share the same true effect size $p$, that is, $p_1 = \cdots = p_k = p$, and the adjusted pooled prevalence is given by
$$\tilde{p} = \frac{\sum_{i=1}^{k} w_i \tilde{p}_i}{\sum_{i=1}^{k} w_i} \qquad (5)$$
where $w_i$ is the adjusted weight assigned to each study (as previously defined; the true weight is unknown). Following the inverse variance method, the weight can be estimated by $w_i = 1/\tilde{\sigma}_i^2$, where $\tilde{\sigma}_i^2$ is a within-study adjusted variance estimate. Note that the weights and the estimated weights are both denoted by $w_i$ in this paper.
Under the random-effects model, we assume that the true effect size can vary from study to study: in addition to the sampling error ($\sigma_i^2$), there is variability between studies ($\tau^2$). The pooled prevalence is also given by equation 5, but the weight assigned to each study can take several estimates (Table 1).
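The pooling step of equation 5 can be illustrated with a short sketch. The exact shrinkage family and variance estimate are not recoverable from the text here, so the code assumes the commonly used form $(X_i + h)/(n_i + 2h)$ with a single $h$ for all studies and the within-study variance $\tilde{p}_i(1 - \tilde{p}_i)/(n_i + 2h)$. Both choices, and all function names, are assumptions made for illustration and may differ from the variants in Table 2.

```python
import math


def adjusted_proportion(x, n, h):
    """Shrinkage estimate (x + h) / (n + 2h). ASSUMED form; h = 0 gives x / n."""
    return (x + h) / (n + 2.0 * h)


def adjusted_wald_pooled(events, sizes, h=0.5, tau2=0.0, z=1.959964):
    """Pooled proportion and Wald-type 95% CI from adjusted estimates.

    ASSUMPTIONS (not taken from the source): the same h > 0 for every study
    and within-study variance p~ (1 - p~) / (n + 2h). tau2 = 0 corresponds to
    the fixed-effect model; a positive tau2 gives random-effects weights.
    With h = 0 a study with no events would have zero variance and break
    the inverse-variance weights, which is why h > 0 is required here.
    """
    p_adj = [adjusted_proportion(x, n, h) for x, n in zip(events, sizes)]
    var_adj = [p * (1.0 - p) / (n + 2.0 * h) for p, n in zip(p_adj, sizes)]
    weights = [1.0 / (v + tau2) for v in var_adj]
    total = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, p_adj)) / total
    se = math.sqrt(1.0 / total)
    lower = max(0.0, pooled - z * se)  # truncate to the [0, 1] support
    upper = min(1.0, pooled + z * se)
    return pooled, lower, upper


# Hypothetical counts, including studies with zero events.
events = [0, 1, 3, 0]
sizes = [50, 120, 400, 30]
print(adjusted_wald_pooled(events, sizes))
```

Because every adjusted estimate lies strictly inside (0, 1) when $h > 0$, studies with no events still receive a finite weight, which is the practical appeal of the adjustment for rare events.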

Transformation methods
The meta-analysis can also be performed using a linear combination of transformed proportions, with the meta-analysis methods applied to the transformed sample proportions. The approach based on the transformation of one proportion is typically applied to overcome two well-known constraints: the support range between 0 and 1 and a non-[...]. The logit and the Freeman-Tukey double arcsine transformations are the ones we consider in this paper when comparing the meta-analyses of one proportion. The meta-analysis is applied to the transformed effect size (proportion), and the back-transformation is then applied to obtain the estimate of the overall proportion. In the meta-analysis of proportions, the Freeman-Tukey double arcsine approach has been taken as the default and has been pointed out as the preferred transformation [13].
The main working example of this work is the prevalence meta-analysis of a multiresistant Staphylococcus aureus bacterium carrying the new mecC gene [8]. This meta-analysis included 25 studies with sample sizes $n_i$ ranging between 6 and 56382; six studies have no occurrences of the mecC gene in Staphylococcus aureus. There is strong and significant heterogeneity between the studies included in this meta-analysis.
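The two transformations can be written out as a short sketch. The function names are hypothetical; the Freeman-Tukey transformation is given in its sum form with approximate variance $1/(n + 0.5)$, and the back-transformation shown is the simple $\sin^2(t/2)$ version, which ignores the sample-size correction that a more careful inverse would apply.

```python
import math


def logit_transform(x, n):
    """Logit of x / n with approximate variance 1/x + 1/(n - x).

    Undefined when x = 0 or x = n, which is common with rare events.
    """
    if x == 0 or x == n:
        raise ValueError("logit is undefined for x = 0 or x = n")
    p = x / n
    return math.log(p / (1.0 - p)), 1.0 / x + 1.0 / (n - x)


def inverse_logit(t):
    """Back-transformation from the logit scale to a proportion."""
    return 1.0 / (1.0 + math.exp(-t))


def freeman_tukey(x, n):
    """Double arcsine transform (sum form), approximate variance 1 / (n + 0.5)."""
    t = (math.asin(math.sqrt(x / (n + 1.0)))
         + math.asin(math.sqrt((x + 1.0) / (n + 1.0))))
    return t, 1.0 / (n + 0.5)


def inverse_freeman_tukey_simple(t):
    """Simplified back-transformation sin^2(t / 2); no sample-size correction."""
    return math.sin(t / 2.0) ** 2


# A study with zero events: the double arcsine is defined, the logit is not.
print(freeman_tukey(0, 99))
print(inverse_logit(logit_transform(5, 100)[0]))
```

The zero-event case shows why the choice of transformation matters for very low proportions: the logit needs a continuity correction or the exclusion of such studies, whereas the double arcsine does not.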

Monte Carlo simulations
To perform the simulation studies, we closely follow the order of magnitude of the prevalence of the mecC gene in Staphylococcus aureus [8]. The overall estimated prevalence of the mecC gene was obtained through the Freeman-Tukey double arcsine transformation and the random-effects model. The pooled prevalence obtained through this method was 0.009 (95% CI = 0.005-0.013). A Monte Carlo simulation study was carried out to compare the adjusted Wald CIs among themselves and with those obtained by the best-known transformations: logit and Freeman-Tukey double arcsine. The performance of each CI was analysed under the random-effects and fixed-effect models [14], although we are mainly interested in the results arising from the random-effects model. We discuss the meta-analysis methods for the proportion effect size (e.g. prevalence, incidence), focusing on rare events, that is, on the practical problem of estimating a low or very low prevalence or incidence. Motivated by the small prevalence values and the simulation scenario proposed in [15]
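The idea behind the coverage comparison can be sketched as a small Monte Carlo loop. This is not the simulation design of this study: it covers only the fixed-effect case, uses hypothetical study sizes much smaller than those of the working example, and reuses the assumed shrinkage form $(X_i + h)/(n_i + 2h)$ with variance $\tilde{p}_i(1 - \tilde{p}_i)/(n_i + 2h)$.

```python
import math
import random


def simulate_coverage(p_true, sizes, h=0.5, reps=300, z=1.959964, seed=1):
    """Empirical coverage of a fixed-effect adjusted Wald CI for a common p.

    ASSUMPTIONS: shrinkage estimate (x + h) / (n + 2h) and variance
    p~ (1 - p~) / (n + 2h); both are illustrative choices.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # One simulated meta-analysis: X_i ~ Binomial(n_i, p_true) per study.
        events = [sum(rng.random() < p_true for _ in range(n)) for n in sizes]
        p_adj = [(x + h) / (n + 2.0 * h) for x, n in zip(events, sizes)]
        var_adj = [p * (1.0 - p) / (n + 2.0 * h) for p, n in zip(p_adj, sizes)]
        weights = [1.0 / v for v in var_adj]
        total = sum(weights)
        pooled = sum(w * p for w, p in zip(weights, p_adj)) / total
        half_width = z * math.sqrt(1.0 / total)
        if pooled - half_width <= p_true <= pooled + half_width:
            hits += 1
    return hits / reps  # proportion of CIs that contain the true value


sizes = [50 + 10 * i for i in range(25)]  # 25 hypothetical studies, n = 50..290
coverage = simulate_coverage(0.01, sizes)
print(f"empirical coverage at p = 0.01: {coverage:.3f}")
```

Repeating the call for different values of `h` and `p_true` gives the kind of comparison between nominal and empirical coverage on which the results are based; a random-effects version would additionally draw study-specific proportions and add an estimate of $\tau^2$ to the variances.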

Results
We assessed the performance of each CI using the fixed-effect and random-effects models. We studied the performance of the methods for small proportions considering 25 studies, since the main working example, the meta-analysis of the prevalence of mecC methicillin-resistant Staphylococcus aureus (MRSA), included a total of 25 studies with rare events. Tables 3-5 show the performance of the seven estimation methods under analysis for the fixed-effect and random-effects models, in the context of small proportions (0.05, 0.01 and 0.001) for k = 25, using the DL or the PM estimator of $\tau^2$, respectively. The DL estimator was chosen to analyze the performance in the meta-analysis of these small proportions because of its popularity, and the PM estimator was chosen because it has been indicated as having good overall performance [16]. Considering the proportions under study and the most used methods (the IV and DL methods with the Wald-0, logit or double arcsine approach) for the overall interval estimation, variant-3 or variant-4 of the adjusted Wald method is a credible competitor to the traditional methods, in some cases with the best coverage probability (Tables 3 and 4). With the PM estimator, the Wald-3 and Wald-4 methods gave better results than the other methods, yet no pattern was identified across the different results (Table 5). We also applied our simulation procedure to other methods proposed in the cited literature (Table 1). For the scenario under study, p ∈ {0.05, 0.01, 0.001}, we found other methods with good accuracy (Table 6). Sorting the results by coverage probability (CP) (Table 6, or the complete tables in the supplementary material), we observe that the unweighted/double arcsine method has the best coverage probability for p = 0.05 and p = 0.01, and the DL/Wald-3 method for p = 0.001 [17-19].

Discussion and Conclusion
The meta-analysis for the mecC MRSA example was performed with the random-effects model owing to the presence of significant heterogeneity. Motivated by our simulation study, we re-estimated the pooled prevalence using the alternatives with the best coverage probability [20,21]. The prevalences estimated by the methods with better coverage probability are lower than that presented in [8] (approximately 0.004 vs 0.009, Table 7). Although the estimated prevalence has halved, the updated results do not show significant differences, since the 95% CI overlaps the previous one. We agree that the most used methods in prevalence meta-analysis, such as the DerSimonian and Laird [8] method with the Freeman-Tukey double arcsine transformation, are in general good methods for the meta-analysis of one proportion. However, when estimating small proportions with the random-effects model for a large number of studies, alternative methods can produce better results; in particular, procedures incorporating variant-3 and variant-4 of the Wald method could provide better results than the methods based on the Freeman-Tukey double arcsine transformation. In the context of the random-effects model, there is still room for improvement, as the non-coverage probability is, in some scenarios, still far from the nominal value (5%) [22-24]. Given the increase in computational power and the existence of several alternative meta-analysis methods whose performance depends on the magnitude of the effect size, we advise carrying out a simulation study in each case to evaluate the accuracy of the method for the particular magnitude of the proportion.