Pooling performance measures across multiply imputed datasets

pool_performance Pooling performance measures for logistic and Cox regression models.

pool_performance(
  data,
  formula,
  nimp,
  impvar,
  plot.indiv,
  model_type = "binomial",
  cal.plot = TRUE,
  plot.method = "mean",
  groups_cal = 10
)

Arguments

data: Data frame with stacked multiply imputed datasets. The original dataset that contains missing values must not be included in the stacked data.
formula: A formula object to specify the model as normally used by glm or coxph. See details.
nimp: A numerical scalar. Number of imputed datasets. Default is 5.
impvar: A character vector. Name of the variable that distinguishes the imputed datasets.
plot.indiv: This argument is deprecated; please use plot.method instead.
model_type: If "binomial" (default), performance measures are calculated for logistic regression models; if "survival", for Cox regression models. See details.
cal.plot: If TRUE (default), a calibration plot is generated. Requires model_type = "binomial".
plot.method: If "mean" (default), one calibration plot is generated after first taking the mean of the linear predictor across the multiply imputed datasets; if "individual", a separate calibration plot is drawn for each imputed dataset; if "overlay", the calibration plots from all imputed datasets are drawn in one figure.
groups_cal: A numerical scalar. Number of groups used for the calibration plot and for the Hosmer and Lemeshow test. Default is 10. If the range of predicted probabilities is narrow, fewer than 10 groups can be chosen, but not fewer than 3.
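
The requirements on data, nimp and impvar can be checked before calling the function. The following is a minimal base R sketch, not part of the package: the toy data frame and its values are invented, and it assumes the original (incomplete) dataset is coded as imputation number 0, which may differ in your data.

```r
# Toy stacked data (invented values). Assumption: Impnr == 0 marks the
# original dataset with missing values; Impnr 1..3 are the imputed copies.
stacked <- data.frame(
  Impnr   = rep(0:3, each = 4),
  Chronic = rep(c(0, 1, 1, 0), times = 4),
  Pain    = c(NA, 5, 7, NA,   # original data, missing values present
              4, 5, 7, 2,     # imputed dataset 1
              3, 5, 7, 3,     # imputed dataset 2
              5, 5, 7, 1)     # imputed dataset 3
)

# Drop the original dataset so that only the imputed copies remain
imputed_only <- stacked[stacked$Impnr != 0, ]

# Values to pass on as impvar and nimp
impvar <- "Impnr"
nimp   <- length(unique(imputed_only[[impvar]]))

# Sanity checks: no missing values left, same number of rows per imputation
stopifnot(!anyNA(imputed_only))
stopifnot(length(unique(table(imputed_only[[impvar]]))) == 1)
```

Here imputed_only, nimp and impvar are then in the form described for the data, nimp and impvar arguments.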

Details

A typical formula object for logistic regression models has the form formula = Outcome ~ terms. For Cox regression models, the formula object must be defined as Surv(time, status) ~ terms. Calibration curves cannot be generated for Cox models.
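
The two formula shapes described above can be sketched in base R as follows. The variable names Time and Status and the helper is_surv_formula are hypothetical and only serve as illustration. Surv() itself comes from the survival package and is only evaluated when a model is fitted, so constructing the formula does not require loading it.

```r
# Logistic regression: Outcome ~ terms
f_logistic <- Chronic ~ Gender + Pain + factor(Carrying)

# Cox regression: Surv(time, status) ~ terms
# (Time and Status are placeholder variable names)
f_cox <- Surv(Time, Status) ~ Gender + Pain

# Hypothetical helper: TRUE if the left-hand side is a call to Surv()
is_surv_formula <- function(f) {
  lhs <- f[[2]]
  is.call(lhs) && identical(lhs[[1]], as.name("Surv"))
}

# Variables referenced by the logistic formula, in order of appearance
vars_logistic <- all.vars(f_logistic)

is_surv_formula(f_logistic)  # FALSE
is_surv_formula(f_cox)       # TRUE
```

A check like this could be used to confirm that the formula matches the chosen model_type before the call is made.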

Examples

 perf <- pool_performance(data = lbpmilr, nimp = 5, impvar = "Impnr",
   formula = Chronic ~ Gender + Pain + Tampascale +
     Smoking + Function + Radiation + Age + factor(Carrying),
   cal.plot = TRUE, plot.method = "mean",
   groups_cal = 10, model_type = "binomial")

 
 perf$ROC_pooled
#>                     95% Low C-statistic 95% Up
#> C-statistic (logit)  0.8005      0.8714 0.9197
 perf$R2_pooled
#> [1] 0.5060599