This vignette focuses on the single threshold method (STM) generalized linear mixed model (GLMM) for the meta-analysis of diagnostic studies, as presented by Chu and Cole (2006) and Chu et al. (2010). The goal of the model is to jointly analyze sensitivity and specificity across studies while accounting for both within-study variability and between-study heterogeneity. First, we introduce the model itself, explaining how sensitivity and specificity are defined within the Chu GLMM and how the estimated parameters can be used to derive the summary receiver operating characteristic (SROC) curve and the area under the curve (AUC). We then show how the model can be fitted and how the SROC curve can be computed, with a brief overview of the internal steps involved in these calculations.

Model Specification

We begin by specifying the sampling model for the observed data within each study. Let \(n_{i11}\), \(n_{i00}\), \(n_{i01}\), and \(n_{i10}\) denote the numbers of true positives, true negatives, false positives, and false negatives reported in study \(i\). Furthermore, let \(n_{i1+} = n_{i11} + n_{i10}\) and \(n_{i0+} = n_{i01} + n_{i00}\) denote the numbers of diseased and non-diseased individuals in the study.

Conditional on these totals, the numbers of true positives and false positives are assumed to follow binomial distributions:

\[ n_{i11} \sim \text{Bin}(n_{i1+}, Se_i) \]

\[ n_{i01} \sim \text{Bin}(n_{i0+}, 1 - Sp_i) \]

where \(Se_i\) and \(Sp_i\) represent the sensitivity and specificity in study \(i\). This formulation captures the within-study sampling variability of the observed diagnostic outcomes.
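As a small illustration of the sampling model, the following sketch computes the totals and the observed sensitivity and specificity from 2x2 counts. The counts and the object name `counts` are invented for illustration and are not taken from the hba1c data.

```r
# Hypothetical 2x2 counts for three studies (illustrative values only)
counts <- data.frame(
  study = c("A", "B", "C"),
  n11 = c(45, 30, 80),   # true positives
  n10 = c(5, 10, 20),    # false negatives
  n01 = c(10, 8, 30),    # false positives
  n00 = c(90, 72, 170)   # true negatives
)

# Numbers of diseased (n1) and non-diseased (n0) individuals per study
counts$n1 <- counts$n11 + counts$n10
counts$n0 <- counts$n01 + counts$n00

# Observed proportions, i.e. the binomial estimates of Se_i and Sp_i
counts$sens <- counts$n11 / counts$n1
counts$spec <- 1 - counts$n01 / counts$n0

print(counts)
```

The GLMM does not use these proportions directly. It works with the counts themselves, which is why studies with small totals contribute less information.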

Since sensitivity and specificity are probabilities restricted to the interval \([0,1]\), it is convenient to transform them to the real line before modeling them with random effects. To achieve this, we apply a monotone link function \(g(\cdot)\) to the sensitivity and the false positive rate:

\[g(Se_i) = \mu_i \]

\[g(1 - Sp_i) = \nu_i \]

Common choices for the link function include the logit, probit, and complementary log-log transformations. The GLMM framework allows us to flexibly choose among these transformations and to evaluate which one provides the best fit to the data. Model selection criteria such as AIC or BIC can be used to choose the best link to be used for the dataset of interest.
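To make the three links concrete, the sketch below uses `make.link()` from base R (the stats package) to transform an illustrative sensitivity to the real line and back. The value `se = 0.8` and the object names are assumptions chosen for illustration only.

```r
# Link objects for the three transformations named above (base R, stats package)
links <- lapply(c(logit = "logit", probit = "probit", cloglog = "cloglog"),
                make.link)

se <- 0.8  # illustrative sensitivity

# g(Se): the sensitivity mapped to the real line under each link
g_se <- sapply(links, function(l) l$linkfun(se))

# Applying the inverse link g^{-1} recovers the original probability
back <- sapply(names(links), function(nm) links[[nm]]$linkinv(g_se[[nm]]))

print(rbind(g_se = g_se, back = back))
```

The three links place the same probability at different points on the real line, which is why the fitted random-effects variances are not directly comparable across links.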

Random Effects

To account for heterogeneity across studies, the transformed sensitivity and false positive rate are modeled jointly using a bivariate normal distribution:

\[(\mu_i, \nu_i)^T \sim N(\boldsymbol{\mu}, \Sigma)\]

with mean vector

\[\boldsymbol{\mu} =\begin{pmatrix}\mu_0 \\ \nu_0\end{pmatrix}\]

and covariance matrix

\[\Sigma =\begin{pmatrix}\sigma_\mu^2 & \rho \sigma_\mu \sigma_\nu\\ \rho \sigma_\mu \sigma_\nu & \sigma_\nu^2\end{pmatrix}.\]

This random-effects structure captures two important features of diagnostic meta-analysis. First, the variances \(\sigma_\mu^2\) and \(\sigma_\nu^2\) represent the between-study heterogeneity in diagnostic performance. Second, the correlation parameter \(\rho\) allows sensitivity and specificity to be statistically related across studies, which commonly occurs when studies apply different diagnostic thresholds.
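The random-effects structure can be explored by simulation. The sketch below draws study-level \((\mu_i, \nu_i)\) pairs from the bivariate normal distribution using a Cholesky factor in base R and maps them back to sensitivities and specificities with the logit link. All parameter values are assumptions chosen for illustration and are not estimates from any dataset.

```r
set.seed(1)

# Illustrative parameter values on the logit scale (assumptions)
mu0 <- 1.0        # mean of g(Se_i)
nu0 <- -1.5       # mean of g(1 - Sp_i)
sigma_mu <- 0.5
sigma_nu <- 0.6
rho <- 0.5

# Covariance matrix as defined above
Sigma <- matrix(c(sigma_mu^2,                rho * sigma_mu * sigma_nu,
                  rho * sigma_mu * sigma_nu, sigma_nu^2), nrow = 2)

# Draw k pairs: z %*% chol(Sigma) has covariance Sigma, then shift by the mean
k <- 2000
z <- matrix(rnorm(2 * k), ncol = 2)
re <- sweep(z %*% chol(Sigma), 2, c(mu0, nu0), "+")
colnames(re) <- c("mu_i", "nu_i")

# Study-specific sensitivity and specificity on the probability scale
se_i <- plogis(re[, "mu_i"])
sp_i <- 1 - plogis(re[, "nu_i"])

# The empirical correlation on the transformed scale should be close to rho
print(cor(re)[1, 2])
```

A positive \(\rho\) on this scale means that studies with higher sensitivity tend to have a higher false positive rate, and therefore lower specificity, which is the pattern expected under varying thresholds.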

Summary Measures

Based on this model, summary measures of diagnostic accuracy can be obtained by transforming the mean parameters back to the original probability scale. In particular, the median sensitivity and median specificity across studies are given by

\[Se_M = g^{-1}(\mu_0)\]

\[Sp_M = 1 - g^{-1}(\nu_0).\]

In this context, the median is often preferred over the mean because the distributions of sensitivity and specificity across studies can be skewed.
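The difference between the median and the mean can be checked numerically. In the sketch below, the values of `mu0`, `nu0`, and `sigma_mu` are assumptions for illustration. Because the inverse link is monotone, the median of the back-transformed values matches \(g^{-1}(\mu_0)\), whereas the mean does not.

```r
set.seed(42)

# Illustrative parameters on the logit scale (assumptions)
mu0 <- 1.0
nu0 <- -1.5
sigma_mu <- 1.0

# Median sensitivity and specificity implied by the mean parameters
se_median <- plogis(mu0)
sp_median <- 1 - plogis(nu0)

# Simulated study-specific sensitivities
se_i <- plogis(rnorm(1e5, mean = mu0, sd = sigma_mu))

print(c(model_median  = se_median,
        sample_median = median(se_i),
        sample_mean   = mean(se_i)))
```

With these values the distribution of \(Se_i\) is left-skewed, so the sample mean falls below the median.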

SROC Curve

The model also provides a convenient way to derive an SROC curve. Under the assumption that the transformed parameters follow a bivariate normal distribution, the regression of \(g(Se)\) on \(g(1-Sp)\) leads to

\[g(Se) = (\mu_0 - \rho \nu_0 \sigma_\mu / \sigma_\nu) +(\rho \sigma_\mu / \sigma_\nu)[g(1 - Sp)].\]

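Because the conditional distribution of \(g(Se)\) given \(g(1 - Sp)\) is normal and \(g\) is monotone, applying the inverse link to this regression line gives the median sensitivity for a given specificity: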

\[ M(Se \mid Sp) = g^{-1} \left\{ (\mu_0 - \rho \nu_0 \sigma_\mu / \sigma_\nu) + (\rho \sigma_\mu / \sigma_\nu)\ [g(1 - Sp)] \right\}. \]

After applying the inverse link function, this relationship defines the SROC curve on the original sensitivity–specificity scale.
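The formula for the median sensitivity translates directly into a short function. The sketch below is an illustration under assumed parameter values. The function name `sroc_median()` is hypothetical and is not part of metaROC.

```r
# Median sensitivity for given specificity under the Chu GLMM.
# `link` is a base R link object, e.g. make.link("logit").
sroc_median <- function(sp, mu0, nu0, sigma_mu, sigma_nu, rho,
                        link = make.link("logit")) {
  slope <- rho * sigma_mu / sigma_nu
  intercept <- mu0 - slope * nu0
  link$linkinv(intercept + slope * link$linkfun(1 - sp))
}

# Evaluate the curve on a grid of specificities (illustrative parameters)
sp_grid <- seq(0.01, 0.99, by = 0.01)
se_grid <- sroc_median(sp_grid, mu0 = 1.0, nu0 = -1.5,
                       sigma_mu = 0.5, sigma_nu = 0.6, rho = 0.5)

# The curve passes through the median operating point (Sp_M, Se_M)
sp_m <- 1 - plogis(-1.5)
se_at_sp_m <- sroc_median(sp_m, mu0 = 1.0, nu0 = -1.5,
                          sigma_mu = 0.5, sigma_nu = 0.6, rho = 0.5)
print(c(se_at_sp_m = se_at_sp_m, se_m = plogis(1.0)))
```

For a positive \(\rho\) the slope is positive, so the median sensitivity decreases as specificity increases, as expected for an ROC-type curve.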

The area under the summary ROC curve (AUC) can then be approximated by integrating the median SROC curve across all specificities:

\[AUC_M = \int_0^1g^{-1}\left\{(\mu_0 - \rho \nu_0 \sigma_\mu / \sigma_\nu)+(\rho \sigma_\mu / \sigma_\nu)\ [g(1 - Sp)]\right\} \, dSp.\]
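The integral generally has no closed form, but it can be evaluated numerically. The sketch below uses `integrate()` from base R. The function name `auc_median()` and all parameter values are assumptions for illustration, and this is not necessarily how the package computes the AUC.

```r
# Numerical approximation of AUC_M via base R integrate().
auc_median <- function(mu0, nu0, sigma_mu, sigma_nu, rho,
                       link = make.link("logit")) {
  slope <- rho * sigma_mu / sigma_nu
  intercept <- mu0 - slope * nu0
  # Median sensitivity as a function of specificity
  integrand <- function(sp) {
    link$linkinv(intercept + slope * link$linkfun(1 - sp))
  }
  integrate(integrand, lower = 0, upper = 1)$value
}

# Illustrative parameters on the logit scale
auc <- auc_median(mu0 = 1.0, nu0 = -1.5,
                  sigma_mu = 0.5, sigma_nu = 0.6, rho = 0.5)

# Sanity check: with rho = 0 the curve is flat at g^{-1}(mu0)
auc_flat <- auc_median(mu0 = 1.0, nu0 = -1.5,
                       sigma_mu = 0.5, sigma_nu = 0.6, rho = 0)
print(c(auc = auc, auc_flat = auc_flat))
```

The flat case is a useful check because the integral then reduces to \(g^{-1}(\mu_0)\).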

Model Application in metaROC

Within the package, the model can be fitted with either the fit_metaROC() function or the metaROC.metaROC() method with action = "estimate". In both cases, the model is selected by setting model = "chu2006glmm" and choosing a link. Here we use the default "logit" link:

library(metaROC)
set.seed(7)
data(hba1c)
stm_hba1c <- hba1c[hba1c$originally_published == 1,]
fit_chu <- fit_metaROC(stm_hba1c, model = "chu2006glmm")
## Hello and welcome to metaROC!
## Requested model: chu2006glmm 
## This is a GLMM for a single threshold per study.
##  See https://doi.org/10.1177/0272989X09353452 for more details.
est_chu <- metaROC(action = "estimate", data = hba1c, model = "chu2006glmm")
## Hello and welcome to metaROC!
## Chosen action: estimate
## Warning in metaROC(action = "estimate", data = hba1c, model = "chu2006glmm"): You are trying to fit an STM to a real-world dataset that may report multiple thresholds for
##                     one or more studies. Please specify which threshold should be used by providing a column named
##                     threshold_stm. Otherwise, each reported threshold will be treated as a separate study.
## Hello and welcome to metaROC!
## Requested model: chu2006glmm 
## This is a GLMM for a single threshold per study.
##  See https://doi.org/10.1177/0272989X09353452 for more details.

Note that the second call passes the unfiltered hba1c data, whereas fit_chu was fitted to stm_hba1c. Because stm_hba1c was filtered using the originally_published column, it contains only one entry per study, so the warning does not apply to fit_chu.

The model is fitted internally using glmer() from the lme4 package (Bates et al. 2015), following Partlett and Takwoingi (2021). The model formula always includes the counts of successes and failures (cbind(true, n - true)) on the left-hand side, with study-specific sensitivity and specificity as fixed effects and corresponding random effects by study.
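The following sketch shows what such a formula can look like in practice. It follows the long-format layout commonly used for bivariate glmer() fits, with one row per study for the diseased group and one for the non-diseased group. The data, the column names (`study`, `sens`, `spec`, `true`, `n`), and the exact formula are assumptions for illustration. The formula that metaROC builds internally may differ.

```r
# Hypothetical wide-format counts for eight studies (illustrative values only)
wide <- data.frame(
  study = 1:8,
  TP = c(45, 30, 80, 22, 61, 37, 54, 18),
  FN = c(5, 10, 20, 8, 9, 13, 16, 7),
  FP = c(10, 8, 30, 12, 25, 6, 19, 4),
  TN = c(90, 72, 170, 48, 115, 84, 101, 36)
)

# Long format: one row per study and disease group, with indicator columns
long <- rbind(
  data.frame(study = wide$study, sens = 1, spec = 0,
             true = wide$TP, n = wide$TP + wide$FN),
  data.frame(study = wide$study, sens = 0, spec = 1,
             true = wide$TN, n = wide$TN + wide$FP)
)

# Successes and failures on the left, indicator fixed effects and
# correlated random effects by study on the right
form <- cbind(true, n - true) ~ 0 + sens + spec + (0 + sens + spec | study)

# Fit only if lme4 is available; toy data may trigger convergence warnings
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(form, data = long, family = binomial(link = "logit"))
  print(lme4::fixef(fit))
  print(lme4::VarCorr(fit))
}
```

Dropping the intercept with `0 +` makes the two fixed effects directly interpretable as the transformed summary sensitivity and specificity.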

By specifying the link argument, the user can choose the transformation applied to the sensitivity and false positive rate, such as logit, probit, or complementary log-log. The default is the logit link.

Once the model is fitted, we can then use the summary.metaROC() method to gain an overview of the estimation results:

summary(fit_chu, ci_type = "wald")
## 
## *** Results of Single threshold method (STM) ***
## 
## Model: GLMM for single threshold 
## Link: logit 
## 
## Total number of studies: 38 
## Total number of thresholds: 38 
## Number of different thresholds: 13 
## 
## Results with Wald confidence intervals: 
## 
## Youden index (sensitivity weight = 0.5): 0.5528
## Estimated Sensitivity and Specificity [95% CI]:
##  Sens: 0.7276 [0.6739; 0.7754]
##  Spec: 0.8107 [0.7650; 0.8493]
## 
## AUC: 0.8134

The summary first states which model was fitted and which link was used, followed by a basic overview of the meta-analysis. It then reports the largest Youden index. Because the threshold itself is not included in the model, however, this value does not provide additional interpretative insight.

Following this, the summary reports the estimated sensitivity and specificity, along with their corresponding 95% confidence intervals. Finally, the AUC is displayed.

By calling the SROC() function, we can also obtain the full estimated SROC curve from the model:

# Store the result under a name that does not mask the SROC() function
sroc_chu <- SROC(fit_chu)
head(sroc_chu$sroc_df, 10)
##    specificity sensitivity youden_index
## 1         0.01   0.9806823 -0.009317697
## 2         0.02   0.9730092 -0.006990782
## 3         0.03   0.9671546 -0.002845389
## 4         0.04   0.9622204  0.002220414
## 5         0.05   0.9578624  0.007862387
## 6         0.06   0.9539054  0.013905375
## 7         0.07   0.9502460  0.020245993
## 8         0.08   0.9468172  0.026817237
## 9         0.09   0.9435728  0.033572764
## 10        0.10   0.9404789  0.040478947

The SROC curve for the Chu GLMM is constructed as described in the model specification section. Conceptually, the function first extracts the fixed effects from the fitted GLMM, together with the variances and covariance of the sensitivity and specificity random effects, which are obtained via VarCorr() from lme4. The estimated fixed effects are back-transformed with the inverse of the specified link function to obtain the summary median sensitivity and specificity with confidence intervals. The SROC curve itself is generated by computing the median sensitivity over a grid of false positive rates, with the correlation between sensitivity and specificity entering through the random-effects covariance. Finally, the curve is returned along with the computed AUC, the sensitivity–specificity data frame, and standard errors.
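The conceptual steps can be sketched in a few lines of base R. The covariance matrix and mean parameters below are hypothetical values on the \((g(Se_i), g(1 - Sp_i))\) scale with a logit link. The column names mirror the sroc_df output shown above, but the code is an illustration under these assumptions and is not the package's internal implementation.

```r
# Hypothetical random-effects covariance matrix and mean parameters
Sigma <- matrix(c(0.25, 0.15,
                  0.15, 0.36), nrow = 2, byrow = TRUE)
mu0 <- 1.0
nu0 <- -1.5

# Standard deviations and correlation derived from the covariance matrix
sigma_mu <- sqrt(Sigma[1, 1])
sigma_nu <- sqrt(Sigma[2, 2])
rho <- Sigma[1, 2] / (sigma_mu * sigma_nu)

# Regression line on the transformed scale
slope <- rho * sigma_mu / sigma_nu
intercept <- mu0 - slope * nu0

# Median sensitivity over a grid of specificities, plus the Youden index
specificity <- seq(0.01, 0.99, by = 0.01)
sensitivity <- plogis(intercept + slope * qlogis(1 - specificity))
sroc_df <- data.frame(specificity, sensitivity,
                      youden_index = sensitivity + specificity - 1)

# Grid point with the largest Youden index
best <- sroc_df[which.max(sroc_df$youden_index), ]
print(best)
```

The Youden index here is simply sensitivity plus specificity minus one, which agrees with the youden_index column in the output above.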

Currently, the package does not support simulating data from this model, so this concludes the discussion of the Chu GLMM.

To gain an overview of how to plot a fitted model and how to conduct simulation studies, particularly for evaluating models such as the Chu GLMM, please refer to the other vignettes included in this package, which provide more detailed guidance on these topics.

References

Bates, Douglas, Martin Mächler, Benjamin M. Bolker, and Steven C. Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67: 1–48. https://doi.org/10.18637/JSS.V067.I01.
Chu, Haitao, and Stephen R Cole. 2006. “Bivariate Meta-Analysis of Sensitivity and Specificity with Sparse Data: A Generalized Linear Mixed Model Approach.” Journal of Clinical Epidemiology 59 (12): 1331–32. https://doi.org/10.1016/j.jclinepi.2006.06.011.
Chu, Haitao, Hongfei Guo, and Yijie Zhou. 2010. “Bivariate Random Effects Meta-Analysis of Diagnostic Studies Using Generalized Linear Mixed Models.” Medical Decision Making 30 (4): 499–508. https://doi.org/10.1177/0272989X09353452.
Partlett, Christopher, and Yemisi Takwoingi. 2021. Meta-Analysis of Test Accuracy Studies in R: A Summary of User-Written Programs and Step-by-Step Guide to Using glmer. https://methods.cochrane.org/sdt/software-meta-analysis-dta-studies.