--- title: "Guiding Oncology Dose-Escalation Trials" author: "Andrew Bean, Sebastian Weber, Lukas Widmer" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Guiding Oncology Dose-Escalation Trials} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} --- ```{r, include=FALSE} library(OncoBayes2) library(knitr) ggplot2::theme_set(bayesplot::bayesplot_theme_get()) knitr::opts_chunk$set( fig.width = 1.62*4, fig.height = 4, warning=FALSE, message=FALSE ) ## setup up fast sampling when run on CRAN is_CRAN <- !identical(Sys.getenv("NOT_CRAN", "false"), "true") ## NOTE: for running this vignette locally, please uncomment the ## following line: ##is_CRAN <- FALSE .user_mc_options <- list() if (is_CRAN) { .user_mc_options <- options(OncoBayes2.MC.warmup=10, OncoBayes2.MC.iter=20, OncoBayes2.MC.chains=1, OncoBayes2.MC.save_warmup=FALSE, mc.cores=1) } else { .user_mc_options <- options(OncoBayes2.MC.warmup=500, OncoBayes2.MC.iter=1000, OncoBayes2.MC.chains=4, OncoBayes2.MC.save_warmup=FALSE, mc.cores=1) } ``` ## Introduction The `OncoBayes2` package provides flexible functions for Bayesian meta-analytic modeling of the incidence of Dose Limiting Toxicities (DLTs) by dose level, under treatment regimes involving any number of combination partners. Such models may be used to ensure patient safety during trial conduct by supporting dose-escalation decisions. In addition, the model can support estimation of the Maximum Tolerated Dose (MTD) in adaptive Bayesian dose-escalation designs. Whereas traditional dose escalation designs, such as the 3+3 design, base the dosing decisions on predefined rules about the number of DLTs in the last one or two cohorts at the current dose, model-based designs such as those using Bayesian Logistic Regression Models (BLRMs) endeavor to model the dose-toxicity relationship as a continuous curve, and allow the model to guide dosing decisions. In this way, all available data contributes to the dosing decisions. Furthermore, extensions to the BLRM approach can support inclusion of available additional data on the compound(s) involved. The additional data can be either historical data collected prior trial conduct or concurrent data, which is collected during trial conduct in the context of another trial/context. The package supports incorporation of additional data through a Meta-Analytic-Combined (MAC) framework [1]. Within the MAC model the heterogeneous sources of data are assigned to *groups* and information is shared across groups through a hierarchical model structure. For any given group this leads to borrowing strength from all other groups while discounting the information from other groups. The amount of discounting (or down-weighting) is determined by the heterogeneity. A group is commonly defined to be a trial, but that must not necessarily hold. The key assumption of the hierarchical model is the exchangability assumption between the groups. There are two independent mechanisms in the package which aim at relaxing the exchangability assumption: - Differential discounting: Groups are assigned to different strata. While the overall hierarchical mean stays the same, the heterogeneity between groups is allowed to be different between strata. Each group must be assigned to a single stratum only. - EXchangeable/Non-EXchangeable (EX/NEX) model for each group: With EXNEX each group is modelled as being exchangeable with some probability and is allowed to have it's own group-specific estimate as if the group is not exchangeable with the remainder of the data. Both techniques are rather advanced and are not discussed further in this introduction. In the following we illustrate first a very common use case of historical information only and then consider concurrent data in addition. In particular, we will discuss a trial evaluating a combination of two drugs whenever historical information is available on each drug individually from separate trials. This example will be expanded by using in addition concurrent data on one of the drugs and on their combination. **Note on terminology:** While in the literature (see [1], [2], and [4]) the term *stratum* refers to a trial commonly, `OncoBayes2` deviates here and uses the term *group* instead. This is more in line with hierarchical modeling terminology. The term *stratum* is used to define a higher level grouping structure. That is, every group is assigned to a single *stratum* within `OncoBayes2`. This higher level grouping (groups of groups) is necessary whenever differential discounting is used. By convention `OncoBayes2` assigns any group to the stratum "all" whenever no stratum is assigned for a group. ```{r, message = FALSE} ## Load involved packages library(dplyr) ## for mutate library(tidyr) ## defines expand_grid library(tibble) ## for tibbles library(ggplot2) ## for plotting ``` ## Example use-case: Dual combination trial with historical information Consider the application described in Section 3.2 of [1], in which the risk of DLT is to be studied as a function of dose for two drugs, drug A and drug B. Historical information on the toxicity profiles of these two drugs is available from single agent trials `trial_A` and `trial_B`. The historical data for this example is available in an internal data set. ```{r} kable(hist_combo2) ``` The objective is to aid dosing and dose-escalation decisions in a future trial, `trial_AB`, in which the drugs will be combined. Additionally, another investigator-initiated trial `IIT` will study the same combination concurrently. Note that these as-yet-unobserved sources of data are included in the input data as unobserved factor levels. This mechanism allows us to specify a joint meta-analytic prior for all four sources of historical and concurrent data. ```{r,echo=TRUE} levels(hist_combo2$group_id) ``` However, we will first consider only the dual combination trial AB and it's historical data and add concurrent data at a later stage. ### Setting up the trial design The function `blrm_trial` provides an object-oriented framework for operationalizing the dose-escalation trial design. This framework is intended as a convenient wrapper for the main model-fitting engine of the package, the `blrm_exnex()` function. The latter allows additional flexibility for specifying the functional form of the model, but `blrm_trial` covers the most common use-cases. This introductory vignette highlights `blrm_trial` in lieu of `blrm_exnex`; the reader is referred to the help-page of the function`?blrm_exnex` for more details. One begins with `blrm_trial` by specifying three key design elements: * The historical dose-toxicity data * Information about the study drugs * The provisional dose levels to be studied during the escalation trial Information about the study drugs is encoded through a `tibble` as below. This includes the names of the study-drugs, the reference doses (see [3] or `?blrm_exnex` to understand the role this choice plays in the model specification), the dosing units, and (optionally) the a priori expected DLT rate for each study drug given individually at the respective reference doses. All design information for the study described in [1] is also included as built-in datasets, which are part of the `OncoBayes2` package. #### Drug info ```{r} kable(drug_info_combo2) ``` #### Dose info The provisional dose levels are specified as below. For conciseness, we limit the dose level of in these provisional doses. ```{r} dose_info <- filter(dose_info_combo2, group_id == "trial_AB", drug_A %in% c(3,6), drug_B %in% c(0,400, 800)) kable(dose_info) ``` #### Initializing a `blrm_trial` Together with the data described in the previous section, these objects can be used to initialize a `blrm_trial` object. ```{r} combo2_trial_setup <- blrm_trial( data = hist_combo2, drug_info = drug_info_combo2, dose_info = dose_info ) ``` ### Specifying the prior and fitting the model At this point, the trial design has been initialized. However, in the absence of `simplified_prior = TRUE`, we have not yet specified the prior distribution for the dose-toxicity model. OncoBayes2 provides two methods for completing the model specification: 1. Use `simplified_prior = TRUE`, which employs a package-default prior distribution, subject to a small number of optional arguments controlling the details. 2. Provide a full prior specification to be passed to the `blrm_exnex` function. For simplicity and conciseness purposes, here we use method #1, which is not recommended for actual trials as the prior should be chosen deliberately and there is no guarantee that the simplified prior will remain stable across releases of the package. See `?'example-combo2_trial'` for an example of #2. The below choice of prior broadly follows the case study in [4], although we slightly deviate from the model in [4] by a different reference dose and mean reference DLT rate. To employ the simplified prior, and fit the model with MCMC: ```{r, message = FALSE, echo = TRUE, results = "hide"} combo2_trial_start <- blrm_trial( data = hist_combo2, drug_info = drug_info_combo2, dose_info = dose_info, simplified_prior = TRUE, EXNEX_comp=FALSE, EX_prob_comp_hist=1, EX_prob_comp_new=1 ) ``` Now, the object `combo2_trial_start` contains the posterior model fit at the start of the trial, in addition to the trial design details. Next we highlight the main methods for extracting relevant information from it. ### Summary of prior specification The function `prior_summary` provides a facility for printing, in a readable format, a summary of the prior specification. ```{r, eval = FALSE} prior_summary(combo2_trial_start) # not run here ``` ### Summary of posterior The main target of inference is generally the probability of DLT at a selection of provisional dose levels. To obtain these summaries for the provisional doses specified previously, we simply write: ```{r} kable(summary(combo2_trial_start, "dose_prediction"), digits = 2) ``` Such summaries may be used to assess which combination doses have unacceptable high risk of toxicity. For example, according to the escalation with overdose control (EWOC) design criteria [3], one would compute the posterior probability that each dose is excessively toxic (column `prob_overdose`; note that the definition of "excessively toxic" is encoded in the `blrm_trial` object through the `interval_prob` argument), and take as eligible doses only those where this probability does not exceed 25% (column `ewoc_ok`). ### Posterior accuracy of EWOC estimation Since the posterior is represented with a large sample of the target density, any estimate derived from it is subject to finite sampling error. The sampling error is determined by the posterior sample size and the quality of the used Markov chain Monte Carlo (MCMC). Hence, it is required to ensure that the MCMC chains have converged and that the number of samples representing the posterior is large enough to estimate desired quantities of interest with sufficient accuracy. The `OncoBayes2` package automatically warns in case of non-convergence as indicated by the Rhat diagnostic [5]. All model parameters must have an Rhat of less than $1.1$ (values much larger than $1.0$ indicate non-convergence). As the primary objective for a BLRM is to determine a safe set of doses via estimation of EWOC, the key quantities defining EWOC are monitored for convergence and sufficient accuracy for each pre-defined dose as well. These diagnostics can be obtained for the pre-defined set of doses via the `ewoc_check` summary routine as: ```{r} kable(summary(combo2_trial_start, "ewoc_check"), digits = 3) ``` For the standard EWOC criterion, the `prob_overdose_est` column contains the 75% quantile of the posterior DLT probability, which must be smaller than 33%. The `prob_overdose_stat` column is centered by 33% and standardized with the Monte-Carlo standard error (mcse). Therefore, negative values correspond to safe doses and since the quantity is approximately distributed as a standard normal random variate, the statistic can be compared with quantiles of the standard normal distribution. `OncoBayes2` will warn for an imprecise EWOC estimate whenever the statistic is within the range of the central 95% interval of $(-1.96,1.96)$. Whenever this occurs it can be useful to increase the number of iterations in order to decrease the mcse, which scales with the inverse of the square root of the MC ess. The MC ess is the number of independent samples the posterior corresponds to (recall that MCMC results in correlated samples). For more information please refer to the help of the `summary.blrm_trial` function (see `help("blrm_trial", help_type="summary")`). ```{r, include=FALSE} po <- summary(combo2_trial_start, "ewoc_check")$prob_overdose_stat min_stat <- po[which.min(abs(po))] ``` We can see that for the pre-defined doses of the trial the EWOC decision can be determined with more than enough accuracy given that the statistic closest to $0$ is $`r round(min_stat, 2)`$. ### Posterior predictive summaries The BLRM allows a principled approach to predicting the number of DLTs that may be observed in a future cohort. This may be a key estimand for understanding and limiting the toxicity risk to patients. For example, suppose a candidate starting dose for the new trial `trial_AB` is 3 mg of drug A + 400 mg of drug B. We may wish to check that at this dose, the predictive probability of 2 or more DLTs out of an initial cohort of 3 to 6 patients is sufficiently low. ```{r} candidate_starting_dose <- summary(combo2_trial_start, "dose_info") %>% filter(drug_A == 3, drug_B == 400) %>% crossing(num_toxicities = 0, num_patients = 3:6) pp_summary <- summary(combo2_trial_start, interval_prob = c(-1, 0, 1, 6), predictive = TRUE, newdata = candidate_starting_dose) kable(bind_cols(select(candidate_starting_dose, num_patients), select(pp_summary, ends_with("]"))), digits = 3) ``` This tells us that for the initial cohort, according to the model, the chance of two or more patients developing DLTs ranges from `r paste0(100 * round(pp_summary[1, "(1,6]"], 3), "%")` to `r paste0(100 * round(pp_summary[4, "(1,6]"], 3), "%")`, depending on the number of patients enrolled. ### Updating the model with new data Dose-escalation designs are adaptive in nature, as dosing decisions are made after each sequential cohort. The model must be updated with the accrued data for each dose escalation decision point. If a new cohort of patients is observed, say: ```{r} new_cohort <- tibble(group_id = "trial_AB", drug_A = 3, drug_B = 400, num_patients = 5, num_toxicities = 1) ``` One can update the model to incorporate this new information using `update()` with `add_data` equal to the new cohort: ```{r, message = FALSE, echo = TRUE, results = "hide"} combo2_trial_update <- update(combo2_trial_start, add_data = new_cohort) ``` This yields a new `blrm_trial` object with updated data and posterior summaries. Obtaining the summaries for the pre-planned provisional doses is then again straightforward: ```{r} kable(summary(combo2_trial_update, "dose_prediction"), digits = 2) ``` In case posterior summaries are needed for doses other than the pre-planned ones, then this is possible using the `newdata_prediction` functionality, which allows to specify a different set of doses via the `newdata` argument: ```{r} kable(summary(combo2_trial_update, "newdata_prediction", newdata = tibble(group_id = "trial_AB", drug_A = 4.5, drug_B = c(400, 600, 800))), digits = 2) ``` ### Data scenarios It may be of interest to test prospectively how this model responds in various scenarios for upcoming cohorts. This can be done easily by again using `update()` with the `add_data` argument. In the code below, we explore 3 possible outcomes for a subsequent cohort enrolled at 3 mg drug A + 800 mg drug B, and review the model's inference at adjacent doses. ```{r, message = FALSE, echo = TRUE, results = "hide"} # set up two scenarios at the starting dose level # store them as data frames in a named list scenarios <- expand_grid( group_id = "trial_AB", drug_A = 3, drug_B = 800, num_patients = 3, num_toxicities = 0:2 ) %>% split(1:3) %>% setNames(paste0(0:2, "/3 DLTs")) candidate_doses <- expand_grid( group_id = "trial_AB", drug_A = c(3, 4.5), drug_B = c(600, 800) ) scenario_inference <- lapply(scenarios, function(scenario_newdata) { # refit the model with each scenario's additional data scenario_fit <- update(combo2_trial_update, add_data = scenario_newdata) # summarize posterior at candidate doses summary(scenario_fit, "newdata_prediction", newdata = candidate_doses) }) %>% bind_rows(.id="Scenario") ``` ```{r, echo = FALSE} kable(select(scenario_inference, -group_id, -stratum_id, -dose_id), digits = 2, caption = "Model inference for trial AB when varying hypothetical DLT scenarios for a cohort of size 3") ``` ## Continuation of example: Using concurrent data In the example of [1], at the time of completion of the first stage of `trial_AB`, the following additional data was observed. ```{r} trial_AB_data <- filter(codata_combo2, group_id == "trial_AB", cohort_time==1) kable(trial_AB_data) ``` These data are easily incorporated into the model using another call to `update`, as below. ```{r, message = FALSE, echo = TRUE, results = "hide"} combo2_trial_histdata <- update(combo2_trial_start, add_data = trial_AB_data) ``` However, during the first stage of `trial_AB`, the `trial_A` studying drug A did continue and collected more data on the drug A dose-toxicity relationship: ```{r} trial_A_codata <- filter(codata_combo2, group_id == "trial_A", cohort_time==1) kable(trial_A_codata) ``` Wthin the MAC framework we may simply add the concurrent data to our overall model which yields refined predictions for future cohorts. ```{r, message = FALSE, echo = TRUE, results = "hide"} combo2_trial_codata <- update(combo2_trial_histdata, add_data = trial_A_codata) ``` To compare the effect of co-data in this case it is simplest to visualize the interval probabilities as predicted by the model for the different data constellations. Here we use the function `plot_toxicity_intervals_stacked` to explore the dose-toxicity relationship in a continuous manner in terms of the dose. ```{r, fig.height = 1.2 * 4, fig.width=1.62 * 4} plot_toxicity_intervals_stacked(combo2_trial_histdata, newdata=mutate(dose_info, dose_id=NULL, stratum_id="all"), x = vars(drug_B), group = vars(drug_A), facet_args = list(ncol = 1) ) + ggtitle("Trial AB with historical data only") plot_toxicity_intervals_stacked(combo2_trial_codata, newdata=mutate(dose_info, dose_id=NULL, stratum_id="all"), x = vars(drug_B), group = vars(drug_A), facet_args = list(ncol = 1) ) + ggtitle("Trial AB with historical and concurrent data on drug A") ``` As we can observe, the additional data on drug A moves the maximal admissible dose allowed by EWOC towards higher doses for drug B whenever drug A is 6 mg. This reflects that drug A has been observed to be relatively safe, since no DLT was observed for a number of doses. In the example of [1], during the conduct of the second stage of the `trial_AB` an additional external data source from a new trial became available. This time it is stemming from another trial which is an investigator-initiated trial `IIT` of the same combination. Numerous toxicities were observed in this concurrent study as stage 2 of `trial_AB`. ```{r} trial_AB_stage_2_codata <- filter(codata_combo2, cohort_time==2) kable(trial_AB_stage_2_codata) ``` As before, through the MAC framework, these data can influence the model summaries for `trial_AB`. We leave it to the reader to explore the differences in the co-data (combined historical and concurrent data) vs the historical data only approach. To conclude we present a graphical summary of the dose-toxicity relationship for the dual combination trial for the final data constellation. Note that we use the `data` option of `update` here to ensure that we use an entirely new dataset which includes all data collected; so this includes historical, trial and concurrent data: ```{r, message = FALSE, echo = TRUE, results = "hide"} combo2_trial_final <- update(combo2_trial_start, data = codata_combo2) ``` As final summary we consider the 75% quantile of the probability for a DLT at all dose combinations. Whenever the 75% quantile exceeds 33%, then the EWOC criterion is violated and the dose is too toxic. ```{r, fig.height = 1.05 * 4, fig.width=1.62 * 4} grid_length <- 25 dose_info_plot_grid <- expand_grid(stratum_id = "all", group_id = "trial_AB", drug_A=seq(min(dose_info_combo2$drug_A), max(dose_info_combo2$drug_A), length.out=grid_length), drug_B=seq(min(dose_info_combo2$drug_B), max(dose_info_combo2$drug_B), length.out=grid_length)) dose_info_plot_grid_sum <- summary(combo2_trial_final, newdata=dose_info_plot_grid, prob=0.5) ggplot(dose_info_plot_grid_sum, aes(drug_A, drug_B, z = !!as.name("75%"))) + geom_contour_filled(breaks=c(0, 0.1, 0.16, 0.33, 1)) + scale_fill_brewer("Quantile Range", type="div", palette = "RdBu", direction=-1) + ggtitle("DLT Probability 75% Quantile") ``` ## References [1] Neuenschwander, B., Roychoudhury, S., & Schmidli, H. (2016). On the use of co-data in clinical trials. Statistics in Biopharmaceutical Research, 8(3), 345-354. [2] Neuenschwander, B., Wandel, S., Roychoudhury, S., & Bailey, S. (2016). Robust exchangeability designs for early phase clinical trials with multiple strata. Pharmaceutical statistics, 15(2), 123-134. [3] Neuenschwander, B., Branson, M., & Gsponer, T. (2008). Critical aspects of the Bayesian approach to phase I cancer trials. Statistics in medicine, 27(13), 2420-2439. [4] Neuenschwander, B., Matano, A., Tang, Z., Roychoudhury, S., Wandel, S. Bailey, Stuart. (2014). A Bayesian Industry Approach to Phase I Combination Trials in Oncology. In Statistical methods in drug combination studies (Vol. 69). CRC Press. [5] Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., Bürkner, P. C. (2021). Rank-Normalization, Folding, and Localization: An Improved ($\hat{R}$) for Assessing Convergence of MCMC, Bayesian Analysis, 16 (2), 667–718. https://doi.org/10.1214/20-BA1221 ## Session Info ```{r} sessionInfo() ``` ```{r, include=FALSE} ## restore previous global user options options(.user_mc_options) ```