Dusseldorp et al., 2016 QUINT Replication

Elise Dusseldorp

2022-07-01

Loading package and data

Check that data aligns

bcrp[1:3, c(1:6,14)]
##     physt1 cesdt1   physt3 cesdt3 negsoct1 uncomt1 cond
## 1 37.65374     14 52.62905      4        9      28    3
## 2 53.64822     10 51.18797     14        7      36    1
## 3 63.84140      8 66.45392      9        6      29    2
ex_data <- subset(bcrp, cond < 3)

Formulas for QUINT commands

In the previous version there were two mistakes in formula 2 here (one mistake was the physt3 - physt3 and the other one is a typo from the paper: it should be physt1 as predictor instead of cesdt1). This typo however gave in this case the same result as using physt1.

formula1 <- I(cesdt1-cesdt3)~cond|cesdt1+negsoct1+uncomt1+disopt1+comorbid+age+wcht1+nationality+marital+trext

#Old formula2 <- I(physt3-physt3)~cond|cesdt1+negsoct1+uncomt1+disopt1+comorbid+age+wcht1+nationality+marital+trext
##New formula 2
formula2 <- I(physt3-physt1)~cond|physt1+negsoct1+uncomt1+disopt1+comorbid+age+wcht1+nationality+marital+trext

Quint model for cesdt difference (replicates)

When you choose a different seed (see below) the result is the same as in the BRM article. In R version 4 the random number generator has been changed. This might give you the feeling that results are depending on the seed, and thus unstable. The stability of the solution can be improved when we increase the number of bootstrap samples (see below the example with control6)

set.seed(4717) 
quint1 <- quint(formula1, data = ex_data)
summary(quint1)
## Warning in summary.quint(quint1): This tree is the initial one and it is not
## pruned. Thus, the qualitative interaction condition has not been checked yet. We
## advise to prune the tree and then summarize its information.
## Partitioning criterion: Effect size criterion 
##  
## Fit information: 
##                Criterion 
##                 - - - - 
##  split #leaves apparent biascorrected   se
##      1       2     2.51          2.22 0.03
##      2       3     2.66          2.28 0.03
##      3       4     2.78          2.36 0.03
##      4       5     2.86          2.38 0.04
##      5       6     2.89          2.41 0.05
##      6       7     2.90          2.33 0.06
## 
## Split information: 
##         parentnode childnodes splittingvar splitpoint
## Split 1          1        2,3      disopt1      18.50
## Split 2          2        4,5     negsoct1       5.50
## Split 3          5      10,11        trext      -0.76
## Split 4          3        6,7      disopt1      21.50
## Split 5         11      22,23      uncomt1      28.50
## Split 6         23      46,47       cesdt1       7.00
## 
## Leaf information: 
##        #(T=1) meanY|T=1 SD|T=1 #(T=2) meanY|T=2 SD|T=2     d   se class
## Leaf 1     11      1.00   3.10      7      3.71   4.75 -0.71 0.54     2
## Leaf 2     14      4.07   4.55      8     -5.50   4.84  2.06 0.59     1
## Leaf 3      9      1.67   7.35     11      3.36   4.86 -0.28 0.48     2
## Leaf 4      9      1.11   1.90      9     -2.56   4.48  1.07 0.55     1
## Leaf 5     16      6.06   7.26     11      1.91   5.39  0.63 0.42     3
## Leaf 6     11     -1.09   2.74     17      1.65   2.94 -0.96 0.43     2
## Leaf 7      8      0.75   6.09      7      0.29   1.80  0.10 0.56     3
quint1$fi
##   split #leaves apparent biascorrected       opt         se   compdif compcard
## 1     1       2 2.508878      2.216474 0.2924035 0.02510283 0.5537120 1.955166
## 2     2       3 2.662751      2.284585 0.3781667 0.03174262 0.6700347 1.992717
## 3     3       4 2.782163      2.362743 0.4194197 0.03334662 1.1088803 1.673282
## 4     4       5 2.862181      2.380237 0.4819436 0.04287284 1.2544722 1.607709
## 5     5       6 2.886169      2.410823 0.4753462 0.04572113 0.9358392 1.950330
## 6     6       7 2.899296      2.327036 0.5722607 0.06432640 1.0688096 1.830487
quint1$si
##         parentnode childnodes splittingvar splitpoint truesplitpoint
## Split 1          1        2,3      disopt1 18.5000000          18.50
## Split 2          2        4,5     negsoct1  5.5000000           5.50
## Split 3          5      10,11        trext -0.7576506          -0.76
## Split 4          3        6,7      disopt1 21.5000000          21.50
## Split 5         11      22,23      uncomt1 28.5000000          28.50
## Split 6         23      46,47       cesdt1  7.0000000           7.00
quint1$li
##        node #(T=1) meanY|T=1   SD|T=1 #(T=2)  meanY|T=2   SD|T=2          d
## Leaf 1    4     11  1.000000 3.098387      7  3.7142857 4.750940 -0.7136858
## Leaf 2   10     14  4.071429 4.548276      8 -5.5000000 4.840307  2.0572337
## Leaf 3   22      9  1.666667 7.348469     11  3.3636364 4.863594 -0.2784485
## Leaf 4   46      9  1.111111 1.900292      9 -2.5555556 4.475241  1.0665296
## Leaf 5   47     16  6.062500 7.261485     11  1.9090909 5.393599  0.6313815
## Leaf 6    6     11 -1.090909 2.736953     17  1.6470588 2.935583 -0.9570573
## Leaf 7    7      8  0.750000 6.088631      7  0.2857143 1.799471  0.1002329
##               se class
## Leaf 1 0.5362530     2
## Leaf 2 0.5890799     1
## Leaf 3 0.4795359     2
## Leaf 4 0.5473016     1
## Leaf 5 0.4196000     3
## Leaf 6 0.4273913     2
## Leaf 7 0.5631033     3

Pruning

quint1pr <- prune(quint1)
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 2.508878 
## split 2 
## #leaves is 3 
## current value of C 2.662751 
## split 3 
## #leaves is 4 
## current value of C 2.782163 
## split 4 
## #leaves is 5
plot(quint1pr)

Result of pruning is the same as Figure 1 in paper

Quint model for physt difference

set.seed(48) #note that this is again a different random number due to R version 4 and higher
quint2 <- quint(formula = formula2, data = ex_data)
## Warning in quint(formula = formula2, data = ex_data): After split 5, the
## partitioning criterion cannot be computed in more than 10 percent of the
## bootstrap samples. The split is unstable. Consider reducing the maximum number
## of leaves using quint.control().

Pruning gives the same result as in the paper.

quint2$fi
##   split #leaves apparent biascorrected       opt         se   compdif compcard
## 1     1       2 2.416531      2.159822 0.2567094 0.02769424 0.5122472 1.904284
## 2     2       3 2.440250      2.007987 0.4322630 0.03286115 0.9582961 1.481954
## 3     3       4 2.634604      2.078552 0.5560526 0.05039910 0.6352842 1.999320
## 4     4       5 2.750414      2.143925 0.6064887 0.05622533 0.9307353 1.819678
## 5     5       6 2.828961      2.157154 0.6718063 0.06697405 0.8448054 1.984155
quint2pr <- prune(quint2) 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2
plot(quint2pr)

Using different tuning parameters

Changing Criterion

Using the new formula2 this works

control3 <- quint.control(crit = "dm")
set.seed(48)
quint3 <- quint(formula = formula2, data = ex_data, control = control3)
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## Bootstrap sample  11 
## Bootstrap sample  12 
## Bootstrap sample  13 
## Bootstrap sample  14 
## Bootstrap sample  15 
## Bootstrap sample  16 
## Bootstrap sample  17 
## Bootstrap sample  18 
## Bootstrap sample  19 
## Bootstrap sample  20 
## Bootstrap sample  21 
## Bootstrap sample  22 
## Bootstrap sample  23 
## Bootstrap sample  24 
## Bootstrap sample  25 
## current value of C 3.077355 
## split 2 
## #leaves is 3 
## splitting process stopped after number of leaves equals 2 because new value of C was not higher than current value of C

Changing Max Leaves (Replicates)

control4 <- quint.control(maxl = 2)
set.seed(48)
quint4 <- quint(formula = formula1, data = ex_data, control = control4)
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## Bootstrap sample  11 
## Bootstrap sample  12 
## Bootstrap sample  13 
## Bootstrap sample  14 
## Bootstrap sample  15 
## Bootstrap sample  16 
## Bootstrap sample  17 
## Bootstrap sample  18 
## Bootstrap sample  19 
## Bootstrap sample  20 
## Bootstrap sample  21 
## Bootstrap sample  22 
## Bootstrap sample  23 
## Bootstrap sample  24 
## Bootstrap sample  25
round(quint4$li, digits = 2)
##        node #(T=1) meanY|T=1 SD|T=1 #(T=2) meanY|T=2 SD|T=2     d   se class
## Leaf 1    2     59      3.22   5.68     46      0.37   5.86  0.50 0.20     1
## Leaf 2    3     19     -0.32   4.41     24      1.25   2.69 -0.44 0.32     2

Changing minimum effect size for split

In version 2, we check the value of dmin after the pruning (thus not after the first split when growing the tree). The advantage is that we also allow trees with 3 or more splits that satisfy the dmin criteria (if we only check after one split we might miss a valuable qualitative interaction further down the tree). For this example, the final result here is the same: the tree is pruned back to the root node. It means that there is no qualitative interaction with a minimum effect size of 0.40 in one of the leaves assigned to P1 and -0.40 in one of the leaves assigned to P2.

control5 <- quint.control(crit = "dm", dmin = 0.40)
set.seed(48)
quint5 <- quint(formula = formula2 , data = ex_data, control = control5)
quint5pr <- prune(quint5)
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2
quint5pr$li
##        node #(T=1) meanY|T=1   SD|T=1 #(T=2) meanY|T=2   SD|T=2         d
## Leaf 1    0     78   3.52226 6.612303     70  5.890787 8.427688 -2.368527
##             se class
## Leaf 1 1.23892     2

Changing number of bootstraps to 50

Now we use the original seed of the paper (47) and increase the number of bootstrap samples to obtain more stable results. The default number of B = 25 was chosen to speed up the computations and based on the paper by LeBlanc, M., & Crowley, J. (1993).

set.seed(47) #this is the original seed of the first example in the paper
control6 <- quint.control(B = 50) #now we choose 50 bootstrap samples
quint6 <- quint(formula = formula1 , data = ex_data, control = control6)
## Warning in quint(formula = formula1, data = ex_data, control = control6): After
## split 5, the partitioning criterion cannot be computed in more than 10 percent
## of the bootstrap samples. The split is unstable. Consider reducing the maximum
## number of leaves using quint.control().
## Warning in quint(formula = formula1, data = ex_data, control = control6): After
## split 6, the partitioning criterion cannot be computed in more than 10 percent
## of the bootstrap samples. The split is unstable. Consider reducing the maximum
## number of leaves using quint.control().

This results again in the same Figure 1 as in the paper:

quint6$fi
##   split #leaves apparent biascorrected       opt         se   compdif compcard
## 1     1       2 2.508878      2.244627 0.2642509 0.02566082 0.5537120 1.955166
## 2     2       3 2.662751      2.295180 0.3675710 0.03017141 0.6700347 1.992717
## 3     3       4 2.782163      2.363180 0.4189829 0.02463731 1.1088803 1.673282
## 4     4       5 2.862181      2.417665 0.4445166 0.02301522 1.2544722 1.607709
## 5     5       6 2.886169      2.412628 0.4735408 0.02376043 0.9358392 1.950330
## 6     6       7 2.899296      2.410083 0.4892138 0.03886487 1.0688096 1.830487
quint6pr <- prune(quint6)
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 2.508878 
## split 2 
## #leaves is 3 
## current value of C 2.662751 
## split 3 
## #leaves is 4 
## current value of C 2.782163 
## split 4 
## #leaves is 5
plot(quint6pr)