This vignette focuses on the discrete time-to-event model for meta-analysis of full ROC curves as presented by Stoye et al. (2024). The goal of this approach is to model the entire diagnostic accuracy profile across all thresholds reported in each study, rather than relying on single summary points. By framing sensitivity and specificity across thresholds as discrete “events” in a time-to-event framework, the model estimates the probability of correct classification at each threshold while accounting for between-study heterogeneity. This allows the construction of a full ROC curve that reflects both within- and between-study variability.
Firstly, we describe the model itself, explaining how the discrete-time hazard functions for diseased and non-diseased individuals are defined at each threshold and how they translate into sensitivity and specificity estimates. Next, we discuss the incorporation of random effects, which model the variation in classification probabilities across studies. Furthermore, we provide an overview of how these estimates can be used to construct the SROC curve and summarize overall diagnostic accuracy across studies. Finally, we explain how the model is implemented and applied within the package.
Model specification
The discrete-time time-to-event model treats each reported diagnostic threshold as defining an interval for the continuous test values \(v_{ij}\). Because individual test values are not directly observed, they are interval-censored, with
\[ T_j = c_k \quad \text{if} \quad v_j \in (c_{k-1}, c_k], \quad k = 1, \dots, K+1, \]
where \(c_0\) and \(c_{K+1}\) represent extreme values. The discrete survival function is defined as
\[ S(c_k \mid x) = P(T \ge c_k \mid x), \quad k = 1, \dots, K \]
and the discrete hazard at threshold \(c_k\) is
\[ \lambda(c_k \mid x) = P(T = c_k \mid T \ge c_k, x) = \frac{f_k}{S_k}. \]
Equivalently, the survival function can be expressed as a product of the past hazard complements:
\[ S(c_k \mid x) = \prod_{t=1}^{k-1} \big(1 - \lambda(c_t \mid x)\big). \]
The proportional hazards model for the discrete-time hazard can be written in terms of the hazard odds as
\[ \frac{\lambda(c_k \mid x)}{1 - \lambda(c_k \mid x)} = \frac{\lambda_0(c_k)}{1 - \lambda_0(c_k)} \exp(x^\top \beta), \]
where \(\lambda_0(c_k)\) is the baseline hazard and \(\beta\) is a vector of parameters. Applying a link function \(g\) maps the hazard to the linear scale. Two common choices are:
- Logit link:
\[ \text{logit}(\lambda(c_k \mid x)) = \log \frac{\lambda(c_k \mid x)}{1 - \lambda(c_k \mid x)} = \alpha_{0k} + x^\top \beta, \]
- Complementary log–log (cloglog) link:
\[ \text{cloglog}(\lambda(c_k \mid x)) = \log(-\log(1-\lambda(c_k \mid x))) = \alpha_{0k} + x^\top \beta. \]
Inversion of the cloglog link provides the hazard at threshold \(c_k\):
\[ \lambda(c_k \mid x) = 1 - \exp\big(-\exp(\alpha_{0k} + x^\top \beta)\big) = h(\alpha_{0k} + x^\top \beta), \]
where \(h(x) = 1 - \exp(-\exp(x))\) corresponds to the cumulative distribution function of the Gompertz distribution. Inversion of the logit link gives
\[ \lambda(c_k \mid x) = \frac{\exp(\alpha_{0k} + x^\top \beta)}{1 + \exp(\alpha_{0k} + x^\top \beta)}, \]
corresponding to the logistic cdf.
To account for between-study heterogeneity, a study-specific bivariate random effect \((\gamma_1, \gamma_2)\) is included, where \(\gamma_1\) represents the nondiseased population and \(\gamma_2\) the diseased population:
\[ (\gamma_1, \gamma_2)^\top \sim N \Big( \mathbf{0}, \begin{pmatrix} \sigma_h^2 & \rho \sigma_h \sigma_d \\ \rho \sigma_h \sigma_d & \sigma_d^2 \end{pmatrix} \Big). \]
The complete linear term for an individual \(j\) is therefore
\[ g(\lambda_{c_k j} \mid x) = \alpha_{0k} + x_{1j}^\top \beta_1 + x_{1j}^\top \beta_2 x_{2j} + \gamma_1 x_{2j} + \gamma_2 (1-x_{2j}), \]
where \(x_1\) encodes the threshold intervals as a categorical variable, \(x_2\) indicates disease status (\(1\) for nondiseased, \(0\) for diseased), and \(g\) is the chosen link function (logit or cloglog). Using the binomial GLMM formulation, the model can equivalently describe the number of events in each interval:
\[ y^h_{ik} \sim \text{Bin}(H_i, \lambda^h_{ik}), \quad y^d_{ik} \sim \text{Bin}(D_i, \lambda^d_{ik}), \]
with \(\lambda^h_{ik}\) and \(\lambda^d_{ik}\) the discrete hazards for nondiseased and diseased individuals in study \(i\) at threshold \(c_k\), respectively. This formulation is computationally more efficient while preserving the full likelihood structure of the model.
Estimation of Sensitivity and Specificity
The main quantities of interest are sensitivity and specificity at different thresholds, which can then be used to construct the SROC curve. Using the cloglog-link, the sensitivity at threshold \(c_k\) for diseased individuals is the probability of having a test value exceeding the threshold:
\[ \text{sens}(c_k \mid x) = P(v > c_k \mid \text{diseased}) = S(c_k \mid x) = \prod_{t=1}^{k-1} \big(1 - \lambda(c_t \mid x)\big), \]
where \(\lambda(c_t \mid x) = 1 - \exp\big(-\exp(\beta_{1t})\big)\) is the discrete hazard from the GLMM, and \(\beta_{1t}\) represents the fixed effect associated with threshold \(c_t\) for diseased individuals. Equivalently,
\[ \text{sens}(c_k \mid x) = \prod_{t=1}^{k-1} \exp\big(-\exp(\beta_{1t})\big). \]
The specificity at threshold \(c_k\) for nondiseased individuals is the probability of having a test value below the threshold:
\[ \text{spec}(c_k \mid x) = P(v \le c_k \mid \text{nondiseased}) = 1 - S(c_k \mid x), \]
with the discrete hazard for nondiseased individuals given by \(\lambda(c_t \mid x) = 1 - \exp\big(-\exp(\beta_{1t} + \beta_{2t})\big)\), where \(\beta_{2t}\) encodes the interaction between the threshold and nondiseased status:
\[ \text{spec}(c_k \mid x) = 1 - \prod_{t=1}^{k-1} \big(1 - \lambda(c_t \mid x)\big) = 1 - \prod_{t=1}^{k-1} \exp\big(-\exp(\beta_{1t} + \beta_{2t})\big). \]
After fitting the GLMM, the estimated parameters \(\hat{\beta}_{1t}\) and \(\hat{\beta}_{2t}\) can be plugged into these formulas to compute the estimated sensitivity and specificity at each threshold.
From these pairs of sensitivity and specificity, the AUC can be obtained numerically, e.g., using the trapezoidal rule.
Application in MetaROC
Inside the package, the model can be fitted using either the
fit_metaROC() function or the
metaROC.metaROC() method with
action = "estimate". The choice of link function is
specified through the model argument: use
model = "stoye2024cloglog" for the cloglog link, or
model = "stoye2024logit" to apply the logit link.
The cloglog link is generally preferred when events are rare or very common, while the logit link provides a standard logistic regression formulation for the discrete hazard.
library(metaROC)
set.seed(7)
data(hba1c)
fit_clog <- fit_metaROC(hba1c, model = "stoye2024cloglog")## Hello and welcome to metaROC!
## Requested model: stoye2024cloglog
## This is a GLMM for the discrete Hazard for multiple thresholds.
## See https://doi.org/10.1002/jrsm.1753 for more details.
## Warning in fit_metaROC_stoye2024dGLMM(data = data, model = "cloglog", nAGQ =
## nAGQ): NAs introduced by coercion
est_clog <- metaROC(action ="estimate", data = hba1c, model = "stoye2024cloglog")## Hello and welcome to metaROC!
## Chosen action: estimate
## Hello and welcome to metaROC!
## Requested model: stoye2024cloglog
## This is a GLMM for the discrete Hazard for multiple thresholds.
## See https://doi.org/10.1002/jrsm.1753 for more details.
## Warning in fit_metaROC_stoye2024dGLMM(data = data, model = "cloglog", nAGQ =
## nAGQ): NAs introduced by coercion
fit_logit <- fit_metaROC(hba1c, model = "stoye2024logit")## Hello and welcome to metaROC!
## Requested model: stoye2024logit
## This is a GLMM for the discrete Hazard for multiple thresholds.
## See https://doi.org/10.1002/jrsm.1753 for more details.
## Warning in fit_metaROC_stoye2024dGLMM(data = data, model = "logit", nAGQ =
## nAGQ): NAs introduced by coercion
est_logit <- metaROC(action ="estimate", data = hba1c, model = "stoye2024logit")## Hello and welcome to metaROC!
## Chosen action: estimate
## Hello and welcome to metaROC!
## Requested model: stoye2024logit
## This is a GLMM for the discrete Hazard for multiple thresholds.
## See https://doi.org/10.1002/jrsm.1753 for more details.
## Warning in fit_metaROC_stoye2024dGLMM(data = data, model = "logit", nAGQ =
## nAGQ): NAs introduced by coercion
Internally, the model is fitted by either lme4::glmer()
or GLMMadaptive::mixed_model(), depending on the chosen
nAGQ setting.
For
nAGQ < 2, the function useslme4::glmer()to fit a binomial GLMM with either a cloglog or logit link. Optimization is performed primarily via the bobyqa algorithm, with Nelder-Mead as a fallback. The fixed and random effects are estimated through penalized quasi-likelihood, and convergence is checked using theglmerconvergence diagnostics.For
nAGQ ≥ 2, the function usesGLMMadaptive::mixed_model()to fit the same model using adaptive Gaussian quadrature to integrate over the random effects. This approach provides more accurate likelihood estimates but can be computationally intensive and sometimes less stable.
The fitting procedure includes safeguards to exclude thresholds with numerically zero lower confidence bounds. Once the optimizer converges, the function returns the estimated coefficients, adjusted standard errors (depending on the link function), study-specific random effects, fitted model object, and the thresholds included in the final fit.
summary(fit_clog, ci_type = "wald")##
## *** Results of Multiple threshold method (MTM) ***
##
## Model: GLMM for multiple thresholds using the discrete hazard
## Link: cloglog
##
## Total number of studies: 38
## Total number of thresholds: 124
## Number of different thresholds: 26
##
## Results with Wald confidence intervals:
##
## Youden index (sensitivity weight = 0.5): 0.5610
## Optimal threshold value: 5.5000
##
## Estimated Sensitivity and Specificity [95% CI]:
## Sens: 0.6457 [0.4917; 0.7627]
## Spec: 0.9152 [0.6727; 0.9966]
##
## AUC: 0.8495
Firstly, the model name is reported, along with a reminder that the
Stoye et al. (2024) GLMM is a
multiple-threshold approach and that the cloglog link was used in
fit_clogto model the discrete hazard of test outcomes.
Afterwards, the summary provides a basic overview of the
meta-analysis.
The output then presents the optimal threshold, defined as the threshold that maximizes the Youden index, together with the corresponding value of the Youden index. In addition, the estimated sensitivity and specificity at this threshold are reported, along with their respective 95% confidence intervals. Finally, the AUC is provided as a summary measure of the overall diagnostic accuracy.
summary(fit_logit, ci_type = "wald")##
## *** Results of Multiple threshold method (MTM) ***
##
## Model: GLMM for multiple thresholds using the discrete hazard
## Link: logit
##
## Total number of studies: 38
## Total number of thresholds: 124
## Number of different thresholds: 26
##
## Results with Wald confidence intervals:
##
## Youden index (sensitivity weight = 0.5): 0.5840
## Optimal threshold value: 5.5000
##
## Estimated Sensitivity and Specificity [95% CI]:
## Sens: 0.6957 [0.4343; 0.8565]
## Spec: 0.8883 [0.4768; 0.9983]
##
## AUC: 0.8557
When examining the fit_logit results, we see that the
logit link was used for modeling. The summary output provides the same
information as the cloglog model, and the same optimal threshold was
identified. However, sensitivity and specificity estimates differ, and
their confidence intervals are notably wider. The AUC remains similar to
that from the cloglog link, though slightly higher.
This suggests that while both link functions agree on the optimal threshold, the logit link introduces greater uncertainty in the estimates, likely due to its less flexible handling of extreme probabilities in interval-censored data. The slightly higher AUC reflects differences in the estimated ROC curve shape.
## threshold sens_lo sensitivity sens_up spec_lo specificity spec_up
## 1 4.8 0.9712970 0.9884242 0.9953559 0.01530039 0.08768839 0.4208815
## 2 5.0 0.9361892 0.9665951 0.9819626 0.06309192 0.18763331 0.5581795
## 3 5.1 0.8895904 0.9371362 0.9637099 0.13284289 0.32203151 0.7105043
## 4 5.2 0.8223257 0.8927621 0.9352954 0.23383148 0.48603828 0.8441859
## 5 5.3 0.7270338 0.8265671 0.8912940 0.37241547 0.66590498 0.9385123
## 6 5.4 0.6142968 0.7439839 0.8345690 0.52118393 0.81406261 0.9827196
## 7 5.5 0.4916621 0.6457367 0.7626818 0.67268202 0.91522798 0.9965865
## 8 5.6 0.3596210 0.5295921 0.6725820 0.80318529 0.97061692 0.9996244
## 9 5.7 0.2424239 0.4121372 0.5734595 0.89766131 0.99239221 0.9999770
## 10 5.8 0.1446729 0.2972016 0.4661999 0.95345563 0.99853287 0.9999993
## youden_index
## 1 0.07611262
## 2 0.15422838
## 3 0.25916769
## 4 0.37880038
## 5 0.49247210
## 6 0.55804649
## 7 0.56096469
## 8 0.50020902
## 9 0.40452943
## 10 0.29573444
The summary SROC is computed by applying the model formulas from the Model specification section to the extracted parameter estimates. For each threshold, the sensitivity and specificity are calculated using the inverse link functions with the fitted fixed-effect coefficients. Confidence intervals are obtained by adjusting the estimates with the standard errors and the normal quantiles corresponding to the desired confidence level. The resulting pairs of sensitivity and specificity are used to assemble the SROC curve, and the weighted Youden index is calculated for each threshold. Finally, the AUC is computed numerically from the SROC points.
Simulation
For the stoye2024dGLMM model, data simulation is
performed at the study and threshold level to mimic the structure of
real diagnostic test accuracy studies. Individuals are grouped by study,
disease status, and diagnostic thresholds. Study-specific bivariate
random effects are drawn for the diseased and nondiseased groups,
accounting for the correlation between them. These random effects are
combined with threshold-specific fixed effects and group indicators to
construct the linear predictor of the GLMM. The linear predictor is then
mapped to the probability scale using either the cloglog or logit link,
yielding the discrete hazard for each threshold. True positives and true
negatives are sampled from binomial distributions using these discrete
hazards as probabilities, with safeguards to ensure that true positives
do not increase and true negatives do not decrease as thresholds
increase, preserving the monotonicity expected in diagnostic tests.
Finally, false positives and false negatives are derived from these
counts, producing simulated contingency tables (TP,
FN, FP, TN) that resemble the
interval-censored data typically reported in diagnostic accuracy
studies.
For the simulation examples, we consider two scenarios corresponding to the two link functions. In the first scenario, using the cloglog link, we simulate data based on predefined model parameters. In the second scenario, using the logit link, we set the parameters such that the underlying true AUC is 0.85. For further details on the input parameters and the simulation procedure, please refer to the simulation guide.
control_sim_cloglog = list(n_studies_min = 5, n_studies_max = 10,
n_subj_min = 20, n_subj_max = 500,
n_thresholds_min = 1, n_thresholds_max = 4,
thresholds_lower_bound = 5, thresholds_upper_bound = 7,
true_preval_min = 0.1, true_preval_max = 0.3,
model = "stoye2024cloglog",
model_pars = list(beta11 = -38.2, beta12 = 10.26,
beta13 = -0.669,
beta21 = 3.516, beta22 = -0.306,
var_rand_H = 0.6561, var_rand_D = 0.9216))
sim_stoye_cloglog <- simulate_metaROC(type = "phaseII", n_sims = 5,
control_sim = control_sim_cloglog)
sim_stoye_cloglog## $data
## # A tibble: 89 × 7
## replicate study threshold TP FN FP TN
## <int> <fct> <fct> <int> <int> <int> <int>
## 1 1 1 5.8 21 117 0 348
## 2 1 1 6.7 0 138 0 348
## 3 1 2 6 17 36 6 375
## 4 1 2 6.6 4 49 0 381
## 5 1 2 6.7 2 51 0 381
## 6 1 3 5.9 38 42 19 396
## 7 1 4 5.8 13 1 31 77
## 8 1 4 5.9 11 3 19 89
## 9 1 4 6.5 5 9 0 108
## 10 1 4 6.6 5 9 0 108
## # ℹ 79 more rows
##
## $data_info
## $data_info$model_id
## [1] "stoye2024cloglog"
##
## $data_info$true_sens
## [1] 9.737029e-01 9.375570e-01 8.891743e-01 8.263907e-01 7.478580e-01
## [6] 6.538353e-01 5.470146e-01 4.330258e-01 3.201362e-01 2.177847e-01
## [11] 1.341150e-01 7.345648e-02 3.513149e-02 1.440224e-02 4.971278e-03
## [16] 1.421436e-03 3.320754e-04 6.274557e-05 9.535363e-06 1.165283e-06
## [21] 1.151997e-07
##
## $data_info$true_spec
## [1] 0.1764855 0.3696620 0.5616079 0.7305545 0.8584556 0.9388991 0.9792862
## [8] 0.9947594 0.9990644 0.9998889 0.9999918 0.9999996 1.0000000 1.0000000
## [15] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
##
## $data_info$true_AUC
## [1] 0.8697526
##
## $data_info$model_pars
## $data_info$model_pars$beta11
## [1] -38.2
##
## $data_info$model_pars$beta12
## [1] 10.26
##
## $data_info$model_pars$beta13
## [1] -0.669
##
## $data_info$model_pars$beta21
## [1] 3.516
##
## $data_info$model_pars$beta22
## [1] -0.306
##
## $data_info$model_pars$var_rand_H
## [1] 0.6561
##
## $data_info$model_pars$var_rand_D
## [1] 0.9216
##
## $data_info$model_pars$betas1
## [1] -3.62500 -3.27469 -2.93776 -2.61421 -2.30404 -2.00725 -1.72384 -1.45381
## [9] -1.19716 -0.95389 -0.72400 -0.50749 -0.30436 -0.11461 0.06176 0.22475
## [17] 0.37436 0.51059 0.63344 0.74291 0.83900
##
## $data_info$model_pars$betas2
## [1] 1.9860 1.9554 1.9248 1.8942 1.8636 1.8330 1.8024 1.7718 1.7412 1.7106
## [11] 1.6800 1.6494 1.6188 1.5882 1.5576 1.5270 1.4964 1.4658 1.4352 1.4046
## [21] 1.3740
##
## $data_info$model_pars$all_available_thresholds
## [1] 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8
## [20] 6.9 7.0
control_sim_logit = list(n_studies_min = 5, n_studies_max = 10,
n_subj_min = 20, n_subj_max = 500,
n_thresholds_min = 1, n_thresholds_max = 4,
thresholds_lower_bound = 5, thresholds_upper_bound = 7,
true_preval_min = 0.1, true_preval_max = 0.3,
model = "stoye2024cloglog",
model_pars = list(preset = TRUE, AUC = "85"))
sim_stoye_logit <- simulate_metaROC(type = "phaseII", n_sims = 5,
control_sim = control_sim_logit)
sim_stoye_logit## $data
## # A tibble: 98 × 7
## replicate study threshold TP FN FP TN
## <int> <fct> <fct> <int> <int> <int> <int>
## 1 1 1 6.3 14 8 3 47
## 2 1 2 6.5 3 14 0 82
## 3 1 3 6.5 1 0 0 20
## 4 1 4 5.3 19 1 13 31
## 5 1 4 6 15 5 0 44
## 6 1 5 6.4 0 50 0 143
## 7 1 6 5.5 15 5 41 144
## 8 1 6 6.2 5 15 0 185
## 9 1 7 6.6 6 77 0 395
## 10 1 8 5.6 4 7 2 43
## # ℹ 88 more rows
##
## $data_info
## $data_info$model_id
## [1] "stoye2024cloglog"
##
## $data_info$true_sens
## [1] 9.737029e-01 9.375570e-01 8.891743e-01 8.263907e-01 7.478580e-01
## [6] 6.538353e-01 5.470146e-01 4.330258e-01 3.201362e-01 2.177847e-01
## [11] 1.341150e-01 7.345648e-02 3.513149e-02 1.440224e-02 4.971278e-03
## [16] 1.421436e-03 3.320754e-04 6.274557e-05 9.535363e-06 1.165283e-06
## [21] 1.151997e-07
##
## $data_info$true_spec
## [1] 0.1524999 0.3251492 0.5047602 0.6729009 0.8110102 0.9076285 0.9632540
## [8] 0.9886084 0.9973761 0.9995732 0.9999535 0.9999968 0.9999999 1.0000000
## [15] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
##
## $data_info$true_AUC
## [1] 0.8499734
##
## $data_info$model_pars
## $data_info$model_pars$preset
## [1] TRUE
##
## $data_info$model_pars$beta11
## [1] -38.2
##
## $data_info$model_pars$beta12
## [1] 10.26
##
## $data_info$model_pars$beta13
## [1] -0.669
##
## $data_info$model_pars$beta21
## [1] 3.356
##
## $data_info$model_pars$beta22
## [1] -0.306
##
## $data_info$model_pars$var_rand_H
## [1] 0.6561
##
## $data_info$model_pars$var_rand_D
## [1] 0.9216
##
## $data_info$model_pars$betas1
## [1] -3.62500 -3.27469 -2.93776 -2.61421 -2.30404 -2.00725 -1.72384 -1.45381
## [9] -1.19716 -0.95389 -0.72400 -0.50749 -0.30436 -0.11461 0.06176 0.22475
## [17] 0.37436 0.51059 0.63344 0.74291 0.83900
##
## $data_info$model_pars$betas2
## [1] 1.8260 1.7954 1.7648 1.7342 1.7036 1.6730 1.6424 1.6118 1.5812 1.5506
## [11] 1.5200 1.4894 1.4588 1.4282 1.3976 1.3670 1.3364 1.3058 1.2752 1.2446
## [21] 1.2140
##
## $data_info$model_pars$all_available_thresholds
## [1] 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8
## [20] 6.9 7.0
This concludes the discussion of the Stoye et al. (2024) GLMM. To gain an overview of how to plot a fitted model and how to conduct simulation studies, particularly for evaluating models such as the Stoye et al. (2024) GLMM, please refer to the other vignettes included in this package, which provide more detailed guidance on these topics.