Pooling and Selection of Cox Regression Models

Martijn W Heymans

2023-06-16

Introduction

With the psfmi package you can pool Cox regression models using the following pooling methods: RR (Rubin’s Rules), D1, D2, and MPR (Median P Rule). You can also apply forward or backward selection to the pooled model. This vignette shows examples of how to apply these procedures.
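
As a rough illustration of what pooling with Rubin's Rules involves, the sketch below combines the estimate and standard error of a single coefficient across imputed datasets in base R. The numbers and the helper `pool_rr` are hypothetical and only for illustration; they are not taken from lbpmicox or from the psfmi source.

```r
# Hypothetical log hazard ratios and standard errors for one predictor,
# one value per imputed dataset (illustrative numbers only).
est <- c(-0.0080, -0.0075, -0.0078, -0.0079, -0.0076)
se  <- c(0.0040, 0.0039, 0.0040, 0.0041, 0.0040)

# Hypothetical helper: Rubin's Rules for a single coefficient.
pool_rr <- function(est, se) {
  m     <- length(est)              # number of imputed datasets
  qbar  <- mean(est)                # pooled estimate
  ubar  <- mean(se^2)               # within-imputation variance
  b     <- var(est)                 # between-imputation variance
  total <- ubar + (1 + 1 / m) * b   # total variance
  c(estimate = qbar, std.error = sqrt(total))
}

pooled <- pool_rr(est, se)
pooled
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation variance adds the uncertainty caused by the missing data.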

Examples

Pooling without BW and method D1

If you set p.crit to 1, no variable selection takes place. Using either direction = "FW" or direction = "BW" then produces the same result.


  library(psfmi)
  pool_coxr <- psfmi_coxr(data=lbpmicox, nimp=5, impvar="Impnr", 
                formula = Surv(Time, Status) ~ Duration + Radiation + Onset + 
                Function + Age + Previous + Tampascale + JobControl + 
                JobDemand + Social + factor(Expect_cat), p.crit=1,
                method="D1", direction = "BW")
  
  pool_coxr$RR_model
#> $`Step 1 - no variables removed -`
#>                   term     estimate   std.error  statistic       df    p.value
#> 1             Duration -0.007738082 0.003989357 -1.9396814 178.7494 0.05399243
#> 2            Radiation -0.074924387 0.153336031 -0.4886287 178.9424 0.62570287
#> 3                Onset -0.093736647 0.175892468 -0.5329202 179.0136 0.59474977
#> 4             Function  0.043959177 0.016917684  2.5984159 175.2854 0.01016263
#> 5                  Age -0.008853266 0.007712097 -1.1479713 177.7066 0.25252344
#> 6             Previous -0.098349421 0.199199997 -0.4937220 178.0134 0.62211113
#> 7           Tampascale -0.023267973 0.014089917 -1.6513918 133.5549 0.10100803
#> 8           JobControl -0.008379513 0.008329969 -1.0059477 178.6831 0.31580101
#> 9            JobDemand -0.021840998 0.015512484 -1.4079626 178.3078 0.16088320
#> 10              Social -0.051348486 0.024912588 -2.0611462 178.3036 0.04074068
#> 11 factor(Expect_cat)2  0.243295714 0.231183479  1.0523923 177.6191 0.29404919
#> 12 factor(Expect_cat)3  0.227055353 0.200259380  1.1338063 178.6187 0.25839523
#>           HR lower.EXP upper.EXP
#> 1  0.9922918 0.9845108 1.0001342
#> 2  0.9278136 0.6855705 1.2556522
#> 3  0.9105225 0.6435046 1.2883376
#> 4  1.0449397 1.0106267 1.0804177
#> 5  0.9911858 0.9762151 1.0063861
#> 6  0.9063322 0.6117408 1.3427877
#> 7  0.9770006 0.9501492 1.0046109
#> 8  0.9916555 0.9754881 1.0080908
#> 9  0.9783958 0.9488992 1.0088093
#> 10 0.9499476 0.9043761 0.9978154
#> 11 1.2754457 0.8082175 2.0127772
#> 12 1.2548993 0.8452496 1.8630856
  pool_coxr$multiparm
#> $`Step 1 - no variables removed -`
#>                    p-values D1 F-statistic
#> Duration           0.052418576   3.7623639
#> Radiation          0.625104588   0.2387580
#> Onset              0.594088831   0.2840039
#> Function           0.009371567   6.7517650
#> Age                0.250982797   1.3178380
#> Previous           0.621503127   0.2437614
#> Tampascale         0.099111558   2.7270949
#> JobControl         0.314440908   1.0119309
#> JobDemand          0.159143091   1.9823588
#> Social             0.039289895   4.2483236
#> factor(Expect_cat) 0.485351826   0.7228839
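
The derived columns of RR_model can be reproduced from estimate, std.error and df. The sketch below does this for the Duration row. It assumes a 95% interval based on the t distribution with the pooled df, which agrees with the printed values but is inferred from the output rather than taken from the package source.

```r
# Values copied from the Duration row of RR_model above.
estimate  <- -0.007738082
std.error <- 0.003989357
df        <- 178.7494

statistic <- estimate / std.error        # Wald statistic
t_crit    <- qt(0.975, df)               # assumed: t-based 95% interval
HR        <- exp(estimate)               # hazard ratio
lower.EXP <- exp(estimate - t_crit * std.error)
upper.EXP <- exp(estimate + t_crit * std.error)

c(statistic = statistic, HR = HR, lower.EXP = lower.EXP, upper.EXP = upper.EXP)

# For a predictor with a single coefficient, the squared statistic
# matches the D1 F-statistic reported in multiparm (3.7623639).
statistic^2
```

The D1 p-value for Duration (0.0524) still differs slightly from the RR p-value (0.0540), which suggests that the two tests use different reference degrees of freedom.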


Pooling with FW and method D1


  library(psfmi)
  pool_coxr <- psfmi_coxr(data=lbpmicox, nimp=5, impvar="Impnr", 
                formula = Surv(Time, Status) ~ Duration + Radiation + Onset + 
                Function + Age + Previous + Tampascale + JobControl + 
                JobDemand + Social + factor(Expect_cat), p.crit=0.05,
                method="D1", direction = "FW")
#> Entered at Step 1 is - Function
#> Entered at Step 2 is - Social
#> Entered at Step 3 is - Duration
#> 
#> Selection correctly terminated, 
#> No new variables entered the model
  
  pool_coxr$RR_model_final
#> $`Final model`
#>       term     estimate   std.error statistic       df      p.value        HR
#> 1 Duration -0.007605055 0.003742491 -2.032084 188.0122 0.0435521399 0.9924238
#> 2 Function  0.056490633 0.015414714  3.664722 188.0122 0.0003219893 1.0581167
#> 3   Social -0.051992380 0.022611457 -2.299382 188.0122 0.0225821316 0.9493361
#>   lower.EXP upper.EXP
#> 1 0.9851240 0.9997776
#> 2 1.0264257 1.0907861
#> 3 0.9079217 0.9926396
  pool_coxr$multiparm_final
#> $`Step 2 - selected - Duration`
#>                    p-value D1
#> Duration            0.0421452
#> Radiation           0.4898252
#> Onset               0.5294033
#> Age                 0.2989933
#> Previous            0.7136713
#> Tampascale          0.1555227
#> JobControl          0.5565327
#> JobDemand           0.1341280
#> factor(Expect_cat)  0.6103930
  pool_coxr$predictors_in
#>          Duration Radiation Onset Function Age Previous Tampascale JobControl
#> Step 1          0         0     0        1   0        0          0          0
#> Step 2          0         0     0        0   0        0          0          0
#> Step 3          1         0     0        0   0        0          0          0
#> Included        1         0     0        1   0        0          0          0
#>          JobDemand Social factor(Expect_cat)
#> Step 1           0      0                  0
#> Step 2           0      1                  0
#> Step 3           0      0                  0
#> Included         0      1                  0
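
To read predictors_in: each step row marks only the predictor that entered at that step, and the Included row is the column sum over the steps. The sketch below rebuilds the step rows from the output above by hand (it does not call psfmi) and recovers both the final set of predictors and the order of entry.

```r
predictors <- c("Duration", "Radiation", "Onset", "Function", "Age",
                "Previous", "Tampascale", "JobControl", "JobDemand",
                "Social", "factor(Expect_cat)")

# Step rows as printed above: a 1 marks the predictor entered at that step.
steps <- matrix(0, nrow = 3, ncol = length(predictors),
                dimnames = list(paste("Step", 1:3), predictors))
steps["Step 1", "Function"] <- 1
steps["Step 2", "Social"]   <- 1
steps["Step 3", "Duration"] <- 1

# The Included row is the column sum over all steps.
included <- colSums(steps)
final_predictors <- names(included)[included == 1]
final_predictors

# Order in which the predictors entered the model.
entry_order <- colnames(steps)[apply(steps, 1, which.max)]
entry_order
```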


Pooling with FW including interaction terms and method D1

Pooling Cox regression models over 5 imputed datasets with forward selection, a p-value criterion of 0.05 and pooling method D1. The model includes interaction terms with a categorical predictor, and the predictor Tampascale is forced into the models during forward selection.


  library(psfmi)
  
  pool_coxr <- psfmi_coxr(data=lbpmicox, nimp=5, impvar="Impnr", 
                formula = Surv(Time, Status) ~ Duration + Radiation + Onset + 
                Function + Age + Previous + Tampascale + factor(Expect_cat) +
                factor(Satisfaction) + Tampascale:Radiation + 
                factor(Expect_cat):Tampascale, keep.predictors = "Tampascale",
                p.crit=0.05, method="D1", direction = "FW")
#> Entered at Step 1 is - Function
#> Entered at Step 2 is - Duration
#> 
#> Selection correctly terminated, 
#> No new variables entered the model
  
  pool_coxr$RR_model_final
#> $`Final model`
#>         term     estimate   std.error statistic       df     p.value        HR
#> 1   Duration -0.008309329 0.003753484 -2.213765 187.9452 0.028047868 0.9917251
#> 2 Tampascale -0.016998788 0.013360456 -1.272321 177.6436 0.204921997 0.9831449
#> 3   Function  0.050217190 0.016427330  3.056930 187.0360 0.002563622 1.0514994
#>   lower.EXP upper.EXP
#> 1 0.9844091 0.9990955
#> 2 0.9575624 1.0094108
#> 3 1.0179701 1.0861332
  pool_coxr$multiparm_final
#> $`Step 1 - selected - Duration`
#>                               p-value D1
#> Duration                      0.02684499
#> Radiation                     0.49765169
#> Onset                         0.47699469
#> Age                           0.18893166
#> Previous                      0.46140973
#> factor(Expect_cat)            0.57323599
#> factor(Satisfaction)          0.78635590
#> Radiation*Tampascale          0.23337740
#> Tampascale*factor(Expect_cat) 0.34684004
  pool_coxr$predictors_in
#>          Duration Radiation Onset Function Age Previous Tampascale
#> Step 1          0         0     0        1   0        0          1
#> Step 2          1         0     0        0   0        0          1
#> Included        1         0     0        1   0        0          1
#>          factor(Expect_cat) factor(Satisfaction) Radiation*Tampascale
#> Step 1                    0                    0                    0
#> Step 2                    0                    0                    0
#> Included                  0                    0                    0
#>          Tampascale*factor(Expect_cat)
#> Step 1                               0
#> Step 2                               0
#> Included                             0
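
An interaction between a continuous predictor and a categorical predictor expands into several coefficients, which is why such a term is evaluated with one pooled multi-parameter test rather than coefficient by coefficient. The sketch below shows the expansion with base R's model.matrix on a small hypothetical data frame (`toy` is not part of lbpmicox). The intercept column appears only because this is a generic design matrix; a Cox model does not estimate one.

```r
# Hypothetical toy data with a three-level categorical predictor.
toy <- data.frame(Tampascale = c(30, 35, 40, 45, 50, 55),
                  Expect_cat = c(1, 2, 3, 1, 2, 3))

X <- model.matrix(~ Tampascale + factor(Expect_cat) +
                    factor(Expect_cat):Tampascale, data = toy)
colnames(X)

# Number of design columns that belong to the interaction term
# (the third term in the formula).
n_interaction <- sum(attr(X, "assign") == 3)
n_interaction
```

With three categories the interaction contributes two coefficients, so its pooled test has two numerator degrees of freedom.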


Pooling with BW including spline coefficients and method D1

Pooling Cox regression models over 5 imputed datasets with backward selection, a p-value criterion of 0.05 and pooling method D1. The model includes a restricted cubic spline predictor, and Tampascale is forced into the models during backward selection.


  library(psfmi)
  
  pool_coxr <- psfmi_coxr(data=lbpmicox, nimp=5, impvar="Impnr", 
                formula = Surv(Time, Status) ~ Duration + Radiation + Onset + 
                Function + Previous + rcs(Tampascale, 3) + 
                factor(Satisfaction) + rcs(Tampascale, 3):Radiation,  
                keep.predictors = "Tampascale",
                p.crit=0.05, method="D1", direction = "BW")
#> Removed at Step 1 is - factor(Satisfaction)
#> Removed at Step 2 is - Radiation*rcs(Tampascale,3)
#> Removed at Step 3 is - Onset
#> Removed at Step 4 is - Radiation
#> Removed at Step 5 is - Previous
#> 
#> Selection correctly terminated, 
#> No more variables removed from the model
  
  pool_coxr$RR_model_final
#> $`Step 6`
#>                            term    estimate  std.error statistic       df
#> 1                      Duration -0.00834864 0.00375033 -2.226108 186.9008
#> 2                      Function  0.05529708 0.01668254  3.314668 184.6679
#> 3  rcs(Tampascale, 3)Tampascale -0.06563800 0.02724068 -2.409558 129.2769
#> 4 rcs(Tampascale, 3)Tampascale'  0.05942661 0.02982168  1.992732 106.0096
#>       p.value        HR lower.EXP upper.EXP
#> 1 0.027201742 0.9916861 0.9843763 0.9990502
#> 2 0.001104361 1.0568545 1.0226366 1.0922174
#> 3 0.017380618 0.9364698 0.8873345 0.9883259
#> 4 0.048860903 1.0612279 1.0003023 1.1258642
  pool_coxr$multiparm_final
#> $`Step 6`
#>                    p-values D1 F-statistic
#> Duration          0.0260069717    4.955557
#> Function          0.0009181754   10.987023
#> rcs(Tampascale,3) 0.0522892728    2.971050
  pool_coxr$predictors_in
#> # A tibble: 3 × 1
#>   value            
#>   <chr>            
#> 1 Duration         
#> 2 Function         
#> 3 rcs(Tampascale,3)
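
The spline term rcs(Tampascale, 3) has two coefficients, so its row in multiparm_final is a joint test of both. The sketch below computes a D1-type pooled Wald statistic in its textbook form for k coefficients tested against zero. The function `d1_statistic` and its inputs are hypothetical, and the code is not claimed to match how psfmi computes the statistic or its p-value; the reference F distribution and its degrees of freedom are left out on purpose.

```r
# est_list:  list of length-k coefficient vectors, one per imputed dataset
# vcov_list: list of k x k covariance matrices, one per imputed dataset
d1_statistic <- function(est_list, vcov_list) {
  m    <- length(est_list)                 # number of imputed datasets
  k    <- length(est_list[[1]])            # number of coefficients tested
  Q    <- do.call(rbind, est_list)         # m x k matrix of estimates
  qbar <- colMeans(Q)                      # pooled coefficients
  ubar <- Reduce(`+`, vcov_list) / m       # average within-imputation covariance
  B    <- cov(Q)                           # between-imputation covariance
  # Average relative increase in variance due to missing data
  r    <- (1 + 1 / m) * sum(diag(B %*% solve(ubar))) / k
  D1   <- as.numeric(t(qbar) %*% solve(ubar) %*% qbar) / (k * (1 + r))
  c(D1 = D1, r = r, k = k)
}

# Hypothetical spline coefficients and a common covariance matrix.
est_list  <- list(c(-0.066, 0.060), c(-0.064, 0.058), c(-0.067, 0.061))
V         <- matrix(c(7.4e-4, -7.0e-4, -7.0e-4, 8.9e-4), nrow = 2)
vcov_list <- list(V, V, V)

d1_statistic(est_list, vcov_list)
```

For a single coefficient (k = 1) this expression reduces to the squared Rubin's Rules statistic.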


Pooling with FW including spline coefficients and method MPR

Pooling Cox regression models over 5 imputed datasets with forward selection, a p-value criterion of 0.05 and pooling method MPR. The model includes a restricted cubic spline predictor, and Tampascale is forced into the models during forward selection.


  library(psfmi)
  pool_coxr <- psfmi_coxr(data=lbpmicox, nimp=5, impvar="Impnr", 
                formula = Surv(Time, Status) ~ Duration + Radiation + Onset + 
                Function + Previous + rcs(Tampascale, 3) + 
                factor(Satisfaction) + rcs(Tampascale, 3):Radiation,  
                keep.predictors = "Tampascale",
                p.crit=0.05, method="MPR", direction = "FW")
#> Entered at Step 1 is - Function
#> Entered at Step 2 is - Duration
#> 
#> Selection correctly terminated, 
#> No new variables entered the model
  
  pool_coxr$RR_model_final
#> $`Final model`
#>                            term    estimate  std.error statistic       df
#> 1                      Duration -0.00834864 0.00375033 -2.226108 186.9008
#> 2  rcs(Tampascale, 3)Tampascale -0.06563800 0.02724068 -2.409558 129.2769
#> 3 rcs(Tampascale, 3)Tampascale'  0.05942661 0.02982168  1.992732 106.0096
#> 4                      Function  0.05529708 0.01668254  3.314668 184.6679
#>       p.value        HR lower.EXP upper.EXP
#> 1 0.027201742 0.9916861 0.9843763 0.9990502
#> 2 0.017380618 0.9364698 0.8873345 0.9883259
#> 3 0.048860903 1.0612279 1.0003023 1.1258642
#> 4 0.001104361 1.0568545 1.0226366 1.0922174
  pool_coxr$multiparm_final
#> $`Step 1 - selected - Duration`
#>                                P-value
#> Duration                    0.02280326
#> Radiation                   0.46220293
#> Onset                       0.41857220
#> Previous                    0.64851606
#> factor(Satisfaction)        0.58075823
#> Radiation*rcs(Tampascale,3) 0.75488109
  pool_coxr$predictors_in
#>          Duration Radiation Onset Function Previous factor(Satisfaction)
#> Step 1          0         0     0        1        0                    0
#> Step 2          1         0     0        0        0                    0
#> Included        1         0     0        1        0                    0
#>          rcs(Tampascale,3) Radiation*rcs(Tampascale,3)
#> Step 1                   1                           0
#> Step 2                   1                           0
#> Included                 1                           0
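
The idea behind the Median P Rule is simple: the test for a predictor is carried out in each imputed dataset separately, and the median of the resulting p-values is used as the pooled p-value. A minimal sketch with hypothetical p-values (not taken from the output above):

```r
# Hypothetical p-values for one predictor, one per imputed dataset.
p_imp <- c(0.031, 0.018, 0.044, 0.027, 0.022)

p_mpr <- median(p_imp)   # pooled p-value under the Median P Rule
p_mpr

# Compare against the selection criterion used in this vignette.
p_mpr < 0.05
```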


Pooling with BW for a stratified Cox model

Pooling Cox regression models over 5 imputed datasets with backward selection, a p-value criterion of 0.05 and pooling method MPR, for a Cox model stratified by Radiation.


  library(psfmi)
  pool_coxr <- psfmi_coxr(data=lbpmicox, nimp=5, impvar="Impnr", 
                formula = Surv(Time, Status) ~ Duration + Onset + 
                Function + Previous + rcs(Tampascale, 3) + 
                factor(Satisfaction) + strata(Radiation), 
                p.crit=0.05, method="MPR", direction = "BW")
#> Removed at Step 1 is - factor(Satisfaction)
#> Removed at Step 2 is - Onset
#> Removed at Step 3 is - Previous
#> 
#> Selection correctly terminated, 
#> No more variables removed from the model
  
  pool_coxr$RR_model_final
#> $`Step 4`
#>                            term     estimate   std.error statistic        df
#> 1                      Duration -0.008484749 0.003738762 -2.269401 186.85272
#> 2                      Function  0.055437188 0.017461177  3.174883 184.30915
#> 3  rcs(Tampascale, 3)Tampascale -0.064744168 0.027417178 -2.361445 124.61432
#> 4 rcs(Tampascale, 3)Tampascale'  0.061363612 0.030289959  2.025873  97.02152
#>       p.value        HR lower.EXP upper.EXP
#> 1 0.024386480 0.9915511 0.9842648 0.9988915
#> 2 0.001757088 1.0570026 1.0212095 1.0940503
#> 3 0.019754160 0.9373072 0.8878009 0.9895742
#> 4 0.045523544 1.0632855 1.0012474 1.1291675
  pool_coxr$multiparm_final
#> $`Step 4`
#>                   p-value MPR
#> Duration          0.024176044
#> Function          0.001305273
#> rcs(Tampascale,3) 0.037329200
  pool_coxr$formula_step
#> $`Step 1 - removal - factor(Satisfaction)`
#> Surv(Time, Status) ~ Duration + Onset + Function + Previous + 
#>     factor(Satisfaction) + rcs(Tampascale, 3) + strata(Radiation)
#> <environment: 0x00000182fa18a788>
#> 
#> $`Step 2 - removal - Onset`
#> Surv(Time, Status) ~ Duration + Onset + Function + Previous + 
#>     rcs(Tampascale, 3) + strata(Radiation)
#> <environment: 0x00000182fa18a788>
#> 
#> $`Step 3 - removal - Previous`
#> Surv(Time, Status) ~ Duration + Function + Previous + rcs(Tampascale, 
#>     3) + strata(Radiation)
#> <environment: 0x00000182fa18a788>
#> 
#> $`Step 4 - removal - ended`
#> Surv(Time, Status) ~ Duration + Function + rcs(Tampascale, 3) + 
#>     strata(Radiation)
#> <environment: 0x00000182fa18a788>
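
In formula_step, each entry shows the model that was fitted at that step, so the predictor named in the entry's label is still present and only disappears from the next formula. The step from one formula to the next can be mimicked with base R's update(). This is a sketch of the idea, not the package's internal code; no data or model fitting is needed because the formula is only manipulated symbolically.

```r
# Formula fitted at Step 1, as printed above.
f1 <- Surv(Time, Status) ~ Duration + Onset + Function + Previous +
  factor(Satisfaction) + rcs(Tampascale, 3) + strata(Radiation)

# Drop the predictor removed at Step 1 to obtain the Step 2 formula.
f2 <- update(f1, . ~ . - factor(Satisfaction))
f2

# Terms that remain; the strata() term is kept throughout.
remaining <- attr(terms(f2), "term.labels")
remaining
```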
