This vignette focuses on the model for meta-analysis of diagnostic test accuracy studies with multiple thresholds as presented by Steinhauser et al. (2016). The goal of this approach is to jointly analyze sensitivity and specificity across studies while explicitly incorporating the information from multiple reported thresholds within each study. In contrast to traditional bivariate models that use only a single sensitivity–specificity pair per study, this method models the relationship between test outcomes and thresholds directly, allowing all available data to contribute to the analysis.
Firstly, we introduce the model itself, explaining how sensitivity and specificity are linked to an underlying continuous test measurement and how the threshold-specific observations can be represented through parametric distributions for diseased and non-diseased populations. We then describe how between-study heterogeneity is incorporated via a mixed-effects framework, yielding parameters that characterize the overall diagnostic accuracy and variability across studies. These parameters can subsequently be used to derive the SROC curve and to identify optimal thresholds.
Afterwards, we focus on how the model can be fitted in practice and how the SROC curve and optimal threshold can be obtained, providing a brief overview of the computational steps involved in estimating the model parameters within a mixed-effects meta-analytic framework.
Model Specification
The approach by Steinhauser et al. (2016) models diagnostic test accuracy by explicitly incorporating multiple thresholds per study and linking them to an underlying continuous biomarker. The key idea is that sensitivity and specificity arise from the distribution of this biomarker in diseased and non-diseased individuals.
Let \(x\) denote a diagnostic threshold. The specificity \(Sp(x)\) corresponds to the probability of a negative test result among non-diseased individuals, while \(1 - Se(x)\) represents the probability of a negative test result among diseased individuals. These quantities can be interpreted as values of the CDFs) of the biomarker in the two groups.
To enable linear modeling, a transformation \(h(\cdot)\) is applied, such as the probit transformation \(\Phi^{-1}\) or the logit transformation. This yields the following linear relationships:
\[ h(Sp(x)) = \frac{x - \mu_0}{\sigma_0}, \]
\[ h(1 - Se(x)) = \frac{x - \mu_1}{\sigma_1}, \]
where \((\mu_0, \sigma_0^2)\) and \((\mu_1, \sigma_1^2)\) are the mean and variance parameters of the biomarker distribution for non-diseased and diseased individuals, respectively.
In this formulation, specificity and sensitivity across thresholds provide information about the full distribution of the biomarker in both groups, rather than a single operating point.
The transformed proportions are modeled using linear predictors that depend on the threshold. For study \(s\) and threshold \(x_{si}\), the fixed-effects part of the model is given by:
\[ h\!\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + \beta_0 x_{si}, \]
\[ h\!\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + \beta_1 x_{si}. \]
Here, \(\alpha_0, \alpha_1\) are intercepts and \(\beta_0, \beta_1\) are slopes for the non-diseased and diseased groups, respectively. These parameters determine the location and scale of the biomarker distributions.
In particular, the distribution parameters can be recovered from the regression coefficients as:
\[ \hat{\mu}_j = -\frac{\alpha_j}{\beta_j}, \qquad \hat{\sigma}_j = \frac{1}{\beta_j}, \quad j = 0,1. \]
Thus, the fixed effects directly encode the estimated biomarker distributions for both groups.
To account for between-study heterogeneity, the model is extended to a linear mixed-effects framework. The most general formulation includes study-specific random intercepts and slopes:
\[ h\!\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_{0s} + (\beta_0 + b_{0s}) x_{si} + e_{si}, \]
\[ h\!\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_{1s} + (\beta_1 + b_{1s}) x_{si} + f_{si}. \]
Here, \(a_{0s}, a_{1s}\) are random intercepts and \(b_{0s}, b_{1s}\) are random slopes for study \(s\). These random effects are assumed to follow a multivariate normal distribution with mean zero and an unstructured covariance matrix, allowing for correlations between intercepts and slopes across groups.
The residual errors are assumed to be normally distributed:
\[ e_{si} \sim N\!\left(0, \frac{\gamma^2}{w_{si}}\right), \qquad f_{si} \sim N\!\left(0, \frac{\gamma^2}{v_{si}}\right), \]
where \(w_{si}\) and \(v_{si}\) are weights, typically based on inverse variances or sample sizes.
This mixed-effects formulation allows the model to capture both within-study variability (across thresholds) and between-study heterogeneity in diagnostic accuracy.
SROC curve
Once the model parameters have been estimated, the SROC curve can be derived directly from the estimated biomarker distributions in the diseased and non-diseased groups. Specifically, the SROC curve is obtained by expressing sensitivity as a function of the false positive rate:
\[ ROC(t) = 1 - F_{\mu_1,\sigma_1}\big(F^{-1}_{\mu_0,\sigma_0}(1 - t)\big), \quad 0 \le t \le 1, \]
where \(F_{\mu,\sigma}\) denotes the cumulative distribution function of the biomarker with location \(\mu\) and scale \(\sigma\).
Conceptually, this construction maps a given false positive rate \(t\) to a threshold via the non-diseased distribution, and then evaluates the corresponding sensitivity using the diseased distribution. In this way, the SROC curve is fully determined by the two estimated biomarker distributions.
Optimal threshold
An optimal threshold can be determined using the weighted Youden index. Using the estimated model parameters, the Youden index can be written as:
\[ \hat{Y}_w(x) = \lambda_w \big(1 - 2 h^{-1}\big(\tfrac{x - \hat{\mu}_1}{\hat{\sigma}_1}\big)\big) + (1 - \lambda_w)\big(2 h^{-1}\big(\tfrac{x - \hat{\mu}_0}{\hat{\sigma}_0}\big) - 1\big). \]
The optimal threshold \(x_0\) is defined as the value of \(x\) that maximizes \(Y_w(x)\). Under the normal distribution assumption, a closed-form solution exists (depending on whether the variances are equal or not), while for other distributions such as the logistic case, the optimum is obtained numerically.
In practice, this threshold represents the point that provides the best trade-off between sensitivity and specificity according to the chosen weighting.
Model Application in MetaROC
Inside the package, the model can be fitted by either the
fit_metaROC() function or the
metaROC.metaROC() method when setting
action = "estimate". The model can then be fitted by
setting model = "steinhauser2016logitlmm":
library(metaROC)
set.seed(7)
data(hba1c)
fit_st <- fit_metaROC(hba1c, model = "steinhauser2016logitlmm")## Hello and welcome to metaROC!
## Requested model: steinhauser2016logitlmm
## This is an LMM for multiple thresholds.
## See https://doi.org/10.1186/s12874-016-0196-1 for more details.
est_st <- metaROC(action ="estimate", data = hba1c, model = "steinhauser2016logitlmm")## Hello and welcome to metaROC!
## Chosen action: estimate
## Hello and welcome to metaROC!
## Requested model: steinhauser2016logitlmm
## This is an LMM for multiple thresholds.
## See https://doi.org/10.1186/s12874-016-0196-1 for more details.
The model is fitted using the diagmeta::diagmeta()
function from the diagmeta package. This function
implements the parametric mixed-effects approach for diagnostic test
accuracy studies with multiple thresholds by fitting a linear
mixed-effects model to the transformed sensitivity and specificity
data.
Conceptually, the function takes all reported thresholds and corresponding \(TP\), \(FP\), \(TN\), and \(FN\) counts across studies and models the relationship between the threshold and the transformed probabilities of negative test results in diseased and non-diseased groups. It estimates fixed effects (intercepts and slopes) describing the underlying biomarker distributions, as well as random effects capturing between-study heterogeneity.
Estimation is carried out via restricted maximum likelihood (REML)
within a linear mixed-effects framework, using the optimization routines
provided by the lme4 package (specifically
lmer()). These routines iteratively optimize the likelihood
to obtain parameter estimates and variance components.
The result is a fitted model object containing the estimated regression parameters, variance components, and the thresholds used, which can subsequently be used to derive the SROC curve and optimal threshold.
summary(fit_st, ci_type = "wald")##
## *** Results of Multiple threshold method (MTM) ***
##
## Model: LMM for multiple thresholds
## Type of model: DICS
##
## Total number of studies: 38
## Total number of thresholds: 124
## Number of different thresholds: 26
##
## Results with Wald confidence intervals:
##
## Youden index (sensitivity weight = 0.5): 0.5007
## Optimal threshold value: 5.8403
##
## Estimated Sensitivity and Specificity [95% CI]:
## Sens: 0.6630 [0.5686; 0.7459]
## Spec: 0.8378 [0.7713; 0.8877]
##
## AUC: 0.7992
Firstly, the model name is reported, along with a reminder that the Steinhauser et al. (2016) model is a multiple-threshold approach. Afterwards, the summary provides a basic overview of the meta-analysis.
The output then presents the optimal threshold, defined as the threshold that maximizes the Youden index, together with the corresponding value of the Youden index. In addition, the estimated sensitivity and specificity at this threshold are reported, along with their respective 95% confidence intervals. Finally, the area under the curve (AUC) is provided as a summary measure of the overall diagnostic accuracy.
## threshold sens_lo sensitivity sens_up spec_lo specificity
## 1 3.510000 0.9943166 0.9980042 0.9993008 0.0002289209 0.0005812995
## 2 3.514855 0.9942640 0.9979811 0.9992911 0.0002338045 0.0005924079
## 3 3.519710 0.9942108 0.9979577 0.9992813 0.0002387920 0.0006037284
## 4 3.524565 0.9941572 0.9979340 0.9992713 0.0002438857 0.0006152651
## 5 3.529419 0.9941031 0.9979101 0.9992612 0.0002490878 0.0006270222
## 6 3.534274 0.9940485 0.9978859 0.9992509 0.0002544006 0.0006390037
## 7 3.539129 0.9939933 0.9978614 0.9992405 0.0002598265 0.0006512141
## 8 3.543984 0.9939377 0.9978367 0.9992300 0.0002653679 0.0006636576
## 9 3.548839 0.9938815 0.9978116 0.9992193 0.0002710272 0.0006763388
## 10 3.553694 0.9938248 0.9977863 0.9992084 0.0002768070 0.0006892621
## spec_up youden_index
## 1 0.001475295 -0.001414514
## 2 0.001500203 -0.001426520
## 3 0.001525532 -0.001438580
## 4 0.001551289 -0.001450695
## 5 0.001577482 -0.001462863
## 6 0.001604117 -0.001475082
## 7 0.001631203 -0.001487352
## 8 0.001658747 -0.001499672
## 9 0.001686757 -0.001512041
## 10 0.001715240 -0.001524457
The SROC curve is computed using the
diagmeta::diagstats() function, which evaluates the fitted
model at a set of specified thresholds. For each threshold, the function
calculates the corresponding sensitivity and specificity by applying the
inverse link function to the estimated linear predictors, thereby
obtaining values on the original probability scale.
Conceptually, this means that the SROC curve is not obtained as a fully continuous function, but rather as a collection of points corresponding to the observed (or user-specified) thresholds. These points are derived from the estimated biomarker distributions and reflect how sensitivity and specificity change across thresholds.
Simulation
Note: The parameterization of this model in our implementation has been changed. Using the parameters presented here does not reflect the new parameterization, hence the warnings. Do not use this model for data generation in the current version of this package!
For this model, it is possible to simulate individual-level data under the assumed logit-linear mixed-effects structure. Data generation is based on the underlying biomarker distributions for diseased and non-diseased individuals. For each study, individual test values are simulated separately for both groups from normal distributions, where the means correspond to the fixed effects and the variances include study-specific random effects to reflect between-study heterogeneity.
Given these simulated biomarker values, multiple study-specific thresholds are applied to classify test results as positive or negative. For each threshold, the numbers of true positives, false positives, true negatives, and false negatives are then obtained by comparing the simulated test values to the threshold.
In this way, the simulated data mimic the structure assumed by the model: multiple thresholds per study, underlying continuous test measurements, and variability across studies driven by random effects.
control_sim_steinh = list(n_studies_min = 5, n_studies_max = 10,
n_subj_min = 20, n_subj_max = 500,
n_thresholds_min = 1, n_thresholds_max = 4,
thresholds_lower_bound = 5, thresholds_upper_bound = 7,
true_preval_min = 0.1, true_preval_max = 0.3,
model = "steinhauser2016logitlmm",
model_pars = list(beta11 = 2, beta12 = 0.5,
beta21 = 3, beta22 = 0.8,
var_rand_H = 0.6561, var_rand_D =0.9216))
sim_steinh <- simulate_metaROC(type = "phaseII", n_sims = 5,
control_sim = control_sim_steinh)## Warning in simulate_metaROC_from_model(n_studies_min =
## control_sim$n_studies_min, : logitLMM data generation does not work with
## changed parameterization!
## Warning in simulate_metaROC_from_model(n_studies_min =
## control_sim$n_studies_min, : logitLMM data generation does not work with
## changed parameterization!
## Warning in simulate_metaROC_from_model(n_studies_min =
## control_sim$n_studies_min, : logitLMM data generation does not work with
## changed parameterization!
## Warning in simulate_metaROC_from_model(n_studies_min =
## control_sim$n_studies_min, : logitLMM data generation does not work with
## changed parameterization!
## Warning in simulate_metaROC_from_model(n_studies_min =
## control_sim$n_studies_min, : logitLMM data generation does not work with
## changed parameterization!
sim_steinh## $data
## # A tibble: 99 × 7
## replicate study threshold TP FN FP TN
## <int> <int> <fct> <int> <int> <int> <int>
## 1 1 1 5.8 0 138 0 348
## 2 1 1 6.7 0 138 0 348
## 3 1 2 6 0 53 0 381
## 4 1 2 6.6 0 53 0 381
## 5 1 2 6.7 0 53 0 381
## 6 1 3 5.9 0 80 0 415
## 7 1 4 5.8 0 14 0 108
## 8 1 4 5.9 0 14 0 108
## 9 1 4 6.5 0 14 0 108
## 10 1 4 6.6 0 14 0 108
## # ℹ 89 more rows
##
## $data_info
## $data_info$model_id
## [1] "steinhauser2016logitlmm"
##
## $data_info$true_sens
## [1] 1.267366e-02 9.440520e-03 6.953148e-03 5.063270e-03 3.645179e-03
## [6] 2.594304e-03 1.825217e-03 1.269343e-03 8.725593e-04 5.928487e-04
## [11] 3.981151e-04 2.642246e-04 1.733097e-04 1.123422e-04 7.196493e-05
## [16] 4.555581e-05 2.849706e-05 1.761484e-05 1.075893e-05 6.493246e-06
## [21] 3.872108e-06
##
## $data_info$true_spec
## [1] 0.9999890 0.9999942 0.9999970 0.9999985 0.9999992 0.9999996 0.9999998
## [8] 0.9999999 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
## [15] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
##
## $data_info$true_AUC
## [1] 0.5063313
##
## $data_info$model_pars
## $data_info$model_pars$beta11
## [1] 2
##
## $data_info$model_pars$beta12
## [1] 0.5
##
## $data_info$model_pars$beta21
## [1] 3
##
## $data_info$model_pars$beta22
## [1] 0.8
##
## $data_info$model_pars$var_rand_H
## [1] 0.6561
##
## $data_info$model_pars$var_rand_D
## [1] 0.9216
##
## $data_info$model_pars$all_available_thresholds
## [1] 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8
## [20] 6.9 7.0
This concludes the discussion of the Steinhauser et al. (2016) LMM. To gain an overview of how to plot a fitted model and how to conduct simulation studies, particularly for evaluating models such as the Steinhauser et al. (2016) LMM, please refer to the other vignettes included in this package, which provide more detailed guidance on these topics.