CoRpower’s Algorithms for Simulating Placebo Group and Baseline Immunogenicity Predictor Data

Introduction

The CoRpower package assumes that \(P(Y^{\tau}(1)=Y^{\tau}(0))=1\) for the biomarker sampling timepoint \(\tau\). Under this assumption, the correlate of risk (CoR) parameter \(P(Y=1 \mid S=s_1, Z=1, Y^{\tau}=0)\) equals \(P(Y=1 \mid S=s_1, Z=1, Y^{\tau}(1)=Y^{\tau}(0)=0)\), which links the CoR and biomarker-specific treatment efficacy (TE) parameters. Estimation of the latter requires outcome data in placebo recipients, and some estimation methods additionally require a baseline immunogenicity predictor (BIP) of \(S(1)\), the biomarker response at \(\tau\) under assignment to treatment. To link power calculations for detecting a CoR and a correlate of TE (coTE), CoRpower can export the simulated data sets used in its calculations, extended to include placebo-group and BIP data for harmonized use by methods assessing biomarker-specific TE. This vignette describes CoRpower’s algorithms, and their underlying assumptions, for simulating placebo-group and BIP data. The exported data sets include full rectangular data so that the user can consider various biomarker sub-sampling designs, e.g., different biomarker case:control sampling ratios, or case-control vs. case-cohort designs.


Algorithms for Simulating Placebo Group Data

Trichotomous \(\, X\) and \(\, S(1)\) Using Approach 1

  1. Specify \(P^{lat}_0\), \(P^{lat}_2\), \(P_0\), \(P_2\), \(risk_0\), \(n_{cases, 0}\), \(n_{controls, 0}\), \(K\)
    • \(N_{complete, 0} = n_{cases, 0} + n_{controls, 0}\)
  2. Specify \(Sens\), \(Spec\), \(FP^0\), and \(FN^2\)
  3. Number of observations in each latent subgroup: \(N_x = N_{complete, 0} P^{lat}_x\)
  4. Simulate \(X\) under the assumption of homogeneous risk in the placebo group:
    • Cases: \(\left(n_{cases, 0}(0),n_{cases,0}(1),n_{cases,0}(2)\right) \sim \mathsf{Mult}(n_{cases,0},(p_0,p_1,p_2))\), where \[\begin{align*} p_x=P(X=x|Y=1,Y^{\tau}=0,Z=0) &= P(X=x|Y(0)=1)\\ &= \frac{P(Y(0)=1|X=x)P(X=x)}{P(Y(0)=1)}\\ &= \frac{risk^{lat}_0(x)P^{lat}_{x}}{risk_0}\\ &= P^{lat}_{x} \quad \text{because } risk^{lat}_0(x)=risk_0 \end{align*}\]
    • Controls: \(\left(n_{controls,0}(0),n_{controls,0}(1),n_{controls,0}(2)\right) \sim \mathsf{Mult}(n_{controls,0},(p_0,p_1,p_2))\), where \[\begin{align*} p_x=P(X=x|Y=0,Y^{\tau}=0,Z=0) &= P(X=x|Y(0)=0)\\ &= \frac{P(Y(0)=0|X=x)P(X=x)}{P(Y(0)=0)}\\ &= \frac{(1-risk^{lat}_0(x))P^{lat}_{x}}{(1-risk_0)}\\ &= P^{lat}_{x} \quad \text{because } risk^{lat}_0(x)=risk_0 \end{align*}\]
    • \(n_{controls,0}(x) = N_x - n_{cases,0}(x)\)
  5. Simulate \(Y\): Vector with \(n_{cases,0}(0)\) 1’s, followed by \(n_{controls,0}(0)\) 0’s, followed by \(n_{cases,0}(1)\) 1’s, etc.
  6. Simulate \(S(1)\): For each of the \(N_x\) subjects, generate \(S(1)\) by a draw from \(\mathsf{Mult}(1,(p_0,p_1,p_2))\), where \(p_k=P(S(1)=k|X=x)\) is given by \(Sens, Spec\), etc.
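Steps 4 to 6 can be sketched as follows. This is a minimal illustration in Python (standard library only), not CoRpower's R implementation. The function names, the sample sizes, and all numeric inputs are hypothetical. The sketch draws both the case and the control counts from the multinomial distribution given in Step 4.

```python
import random
from collections import Counter

def multinomial(rng, n, probs):
    """Draw category counts from Mult(n, probs) using the standard library."""
    draws = Counter(rng.choices(range(len(probs)), weights=probs, k=n))
    return [draws.get(k, 0) for k in range(len(probs))]

def simulate_placebo_trichotomous(n_cases, n_controls, p_lat, class_probs, seed=0):
    """Return parallel lists (X, Y, S1) for the placebo group.

    p_lat[x]          = P(X = x), the latent subgroup prevalences
    class_probs[x][k] = P(S(1) = k | X = x), built from Sens, Spec, FP^0, FN^2, etc.
    """
    rng = random.Random(seed)
    # Step 4: under homogeneous placebo risk, cases and controls share p_x = P^lat_x
    cases = multinomial(rng, n_cases, p_lat)
    controls = multinomial(rng, n_controls, p_lat)
    X, Y, S1 = [], [], []
    for x in range(3):
        # Step 5: within subgroup x, the cases (1's) precede the controls (0's)
        for y, count in ((1, cases[x]), (0, controls[x])):
            for _ in range(count):
                X.append(x)
                Y.append(y)
                # Step 6: one draw from Mult(1, P(S(1) = . | X = x))
                S1.append(rng.choices((0, 1, 2), weights=class_probs[x])[0])
    return X, Y, S1

# Hypothetical inputs, chosen only for illustration
p_lat = [0.2, 0.2, 0.6]
class_probs = [
    [0.8, 0.2, 0.0],  # X = 0: Spec, 1 - Spec - FP^0, FP^0
    [0.1, 0.8, 0.1],  # X = 1: FN^1, 1 - FN^1 - FP^1, FP^1
    [0.0, 0.2, 0.8],  # X = 2: FN^2, 1 - FN^2 - Sens, Sens
]
X, Y, S1 = simulate_placebo_trichotomous(30, 970, p_lat, class_probs)
```

Because \(FP^0 = FN^2 = 0\) in this hypothetical rule, no subject with \(X=0\) is classified as \(S(1)=2\), and no subject with \(X=2\) as \(S(1)=0\).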

Trichotomous \(\, X\) and \(\, S(1)\) Using Approach 2

  1. Specify \(P^{lat}_0\), \(P^{lat}_2\), \(P_0\), \(P_2\), \(risk_0\), \(N_{complete,0}\), \(n_{cases,0}\), \(n^S_{cases}\), \(K\)
  2. Specify \(\rho\) and \(\sigma^2_{obs}\)
  3. Calculation of \((Sens, Spec, FP^0, FP^1, FN^1, FN^2)\):
    1. Assuming the classical measurement error model, where \(X^{\ast} \sim \mathsf{N}(0,\sigma^2_{tr})\) with \(\sigma^2_{tr} = \rho\sigma^2_{obs}\), solve \[P^{lat}_0 = P(X^{\ast} \leq \theta_0) \quad \textrm{and} \quad P^{lat}_2 = P(X^{\ast} > \theta_2)\] for \(\theta_0\) and \(\theta_2\)
    2. Generate \(B\) realizations of \(X^{\ast}\) and \(S^{\ast} = X^{\ast} + e\), where \(e \sim \mathsf{N}(0,\sigma^2_{e})\) and \(X^{\ast}\) is independent of \(e\)
      • \(B = 20,000\) by default
    3. Using \(\theta_0\) and \(\theta_2\) from Sub-step 1, define \[\begin{align*} Spec(\phi_0) &= P(S^{\ast} \leq \phi_0 \mid X^{\ast} \leq \theta_0)\\ FN^1(\phi_0) &= P(S^{\ast} \leq \phi_0 \mid X^{\ast} \in (\theta_0,\theta_2])\\ FN^2(\phi_0) &= P(S^{\ast} \leq \phi_0 \mid X^{\ast} > \theta_2)\\ Sens(\phi_2) &= P(S^{\ast} > \phi_2 \mid X^{\ast} > \theta_2)\\ FP^1(\phi_2) &= P(S^{\ast} > \phi_2 \mid X^{\ast} \in (\theta_0,\theta_2])\\ FP^0(\phi_2) &= P(S^{\ast} > \phi_2 \mid X^{\ast} \leq \theta_0) \end{align*}\]

      Estimate \(Spec(\phi_0)\) by \[\widehat{Spec}(\phi_0) = \frac{\#\{S^{\ast}_b \leq \phi_0, X^{\ast}_b \leq \theta_0\}}{\#\{X^{\ast}_b \leq \theta_0\}}\,\] etc.
    4. Find \(\phi_0 = \phi^{\ast}_0\) and \(\phi_2 = \phi^{\ast}_2\) that numerically solve \[\begin{align*} P_0 &= \widehat{Spec}(\phi_0)P^{lat}_0 + \widehat{FN}^1(\phi_0)P^{lat}_1 + \widehat{FN}^2(\phi_0)P^{lat}_2\\ P_2 &= \widehat{Sens}(\phi_2)P^{lat}_2 + \widehat{FP}^1(\phi_2)P^{lat}_1 + \widehat{FP}^0(\phi_2)P^{lat}_0 \end{align*}\] and compute \[ Spec = \widehat{Spec}(\phi^{\ast}_0),\; Sens = \widehat{Sens}(\phi^{\ast}_2),\; \textrm{etc.} \]
  4. Follow Steps 3–6 under Approach 1
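The Monte Carlo calculation in Step 3 can be sketched as follows, again in Python with the standard library only. This is an illustration under stated assumptions, not the package's code. It assumes \(\sigma^2_{tr} = \rho\sigma^2_{obs}\) and \(\sigma^2_{e} = (1-\rho)\sigma^2_{obs}\), uses plain bisection as the numerical solver (a choice made here), and the function name and inputs are hypothetical.

```python
import random
from bisect import bisect_right
from statistics import NormalDist

def approach2_probs(p_lat0, p_lat2, p0, p2, rho, sigma2_obs, B=20000, seed=0):
    """Estimate Spec, FN^1, FN^2, Sens, FP^1, FP^0 by simulation (Step 3)."""
    rng = random.Random(seed)
    p_lat = (p_lat0, 1 - p_lat0 - p_lat2, p_lat2)
    sd_tr = (rho * sigma2_obs) ** 0.5       # assumed: sigma2_tr = rho * sigma2_obs
    sd_e = ((1 - rho) * sigma2_obs) ** 0.5  # assumed: sigma2_e = (1 - rho) * sigma2_obs

    # Sub-step 1: thresholds of X* that reproduce the latent prevalences
    x_dist = NormalDist(0, sd_tr)
    theta0, theta2 = x_dist.inv_cdf(p_lat0), x_dist.inv_cdf(1 - p_lat2)

    # Sub-step 2: B realizations of (X*, S*); store S* by latent category of X*
    s_by_x = ([], [], [])
    for _ in range(B):
        x = rng.gauss(0, sd_tr)
        s = x + rng.gauss(0, sd_e)
        s_by_x[0 if x <= theta0 else 1 if x <= theta2 else 2].append(s)
    for s_values in s_by_x:
        s_values.sort()

    def frac_le(x, phi):
        """Sub-step 3: empirical P(S* <= phi | latent category x)."""
        return bisect_right(s_by_x[x], phi) / len(s_by_x[x])

    def marginal_le(phi):
        """Mixture over latent categories, i.e. the estimated P(S* <= phi)."""
        return sum(p * frac_le(x, phi) for x, p in enumerate(p_lat))

    def solve(target):
        """Sub-step 4: bisection on the nondecreasing function marginal_le."""
        lo, hi = -10 * sigma2_obs ** 0.5, 10 * sigma2_obs ** 0.5
        for _ in range(60):
            mid = (lo + hi) / 2
            if marginal_le(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    phi0 = solve(p0)      # P_0 = P(S* <= phi0)
    phi2 = solve(1 - p2)  # P_2 = P(S* > phi2) = 1 - P(S* <= phi2)
    return {
        "phi0": phi0, "phi2": phi2,
        "Spec": frac_le(0, phi0), "FN1": frac_le(1, phi0), "FN2": frac_le(2, phi0),
        "Sens": 1 - frac_le(2, phi2), "FP1": 1 - frac_le(1, phi2),
        "FP0": 1 - frac_le(0, phi2),
    }

# Hypothetical inputs, chosen only for illustration
probs = approach2_probs(0.2, 0.6, 0.2, 0.6, rho=0.9, sigma2_obs=1.0)
```

The returned values fill the classification matrix \(P(S(1)=k \mid X=x)\) that Step 6 of Approach 1 requires.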

Continuous \(\, X^*\) and \(\, S^*(1)\)

  1. Specify \(P^{lat}_{lowestVE}\), \(\rho\), \(\sigma^2_{obs}\), \(VE_{lowest}\), \(risk_0\), \(n_{cases,0}\), \(n_{controls, 0}\), \(n^S_{cases}\), \(K\)
    • \(N_{complete, 0} = n_{cases, 0} + n_{controls, 0}\)
  2. Simulate \(Y\) by creating a vector with \(n_{cases,0}\) 1’s followed by \(n_{controls,0}\) 0’s.
  3. Simulate \(X^*\) under the assumption of homogeneous risk in the placebo group:
    • Cases: from a grid of values ranging from -3 to 3, sample \(n_{cases,0}\) with replacement from: \[\begin{align*} f_{X^{\ast}}(x^{\ast}|Y=1,Y^{\tau}=0,Z=0) &= f_{X^{\ast}}(x^{\ast}|Y(0)=1)\\ &= \frac{P(Y(0)=1|X^*=x^*)f_{X^{\ast}}(x^{\ast})}{P(Y(0)=1)}\\ &= \frac{risk^{lat}_0(x^*)f_{X^{\ast}}(x^{\ast})}{risk_0}\\ &= f_{X^{\ast}}(x^{\ast}) \quad \text{because } risk^{lat}_0(x^*)=risk_0 \end{align*}\]
    • Controls: from a grid of values ranging from -3 to 3, sample \(n_{controls,0}\) with replacement from: \[\begin{align*} f_{X^{\ast}}(x^{\ast}|Y=0,Y^{\tau}=0,Z=0) &= f_{X^{\ast}}(x^{\ast}|Y(0)=0)\\ &= \frac{P(Y(0)=0|X^*=x^*)f_{X^{\ast}}(x^{\ast})}{P(Y(0)=0)}\\ &= \frac{(1-risk^{lat}_0(x^*))f_{X^{\ast}}(x^{\ast})}{1-risk_0}\\ &= f_{X^{\ast}}(x^{\ast}) \quad \text{because } risk^{lat}_0(x^*)=risk_0 \end{align*}\]
    • \(f_{X^{\ast}}(x^{\ast})\) is fully specified because \(X^* \sim N(0, \sigma^2_{tr})\)
  4. Simulate \(S^*(1)\): \(S^*(1)=X^*+\epsilon,\) where \(\epsilon \sim N(0, \sigma^2_e)\) with \(\sigma_e^2=(1-\rho)\sigma^2_{obs}\) is independent of \(X^*\). In R, \(\epsilon\) is simulated by rnorm(Ncomplete, mean=0, sd=sqrt(sigma2e))
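Steps 2 to 4 can be sketched in Python (standard library only) as follows. The sketch is illustrative rather than the package's implementation. It assumes \(\sigma^2_{tr} = \rho\sigma^2_{obs}\), and the grid resolution grid_size, the function name, and the inputs are hypothetical.

```python
import random
from statistics import NormalDist

def simulate_placebo_continuous(n_cases, n_controls, rho, sigma2_obs,
                                grid_size=601, seed=0):
    """Return parallel lists (X_star, Y, S1_star) for the placebo group."""
    rng = random.Random(seed)
    sd_tr = (rho * sigma2_obs) ** 0.5       # assumed: sigma2_tr = rho * sigma2_obs
    sd_e = ((1 - rho) * sigma2_obs) ** 0.5  # sigma2_e = (1 - rho) * sigma2_obs

    # Step 2: cases (1's) followed by controls (0's)
    Y = [1] * n_cases + [0] * n_controls

    # Step 3: grid on [-3, 3], weighted by the N(0, sigma2_tr) density; under
    # homogeneous placebo risk, cases and controls share this density
    grid = [-3 + 6 * i / (grid_size - 1) for i in range(grid_size)]
    x_dist = NormalDist(0, sd_tr)
    density = [x_dist.pdf(x) for x in grid]
    X_star = (rng.choices(grid, weights=density, k=n_cases)
              + rng.choices(grid, weights=density, k=n_controls))

    # Step 4: S*(1) = X* + epsilon, with epsilon independent of X*
    S1_star = [x + rng.gauss(0, sd_e) for x in X_star]
    return X_star, Y, S1_star

# Hypothetical inputs, chosen only for illustration
X_star, Y, S1_star = simulate_placebo_continuous(30, 970, rho=0.9, sigma2_obs=1.0)
```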

Algorithms for Simulating a Baseline Immunogenicity Predictor (BIP)

Trichotomous \(\, X, S(1),\) and \(\, BIP\) Using Approach 1

  1. The user specifies a classification rule defined by \(P(BIP=i \mid S(1)=j)\), \(i,j=0,1,2\).
  2. For a subject with biomarker measurement \(S_k(1)\), generate \(BIP_k\) by a draw from \(\mathsf{Mult}(1, (q_0, q_1, q_2))\), where \(q_i=P(BIP_k=i \mid S(1)=S_k(1))\), \(i=0,1,2\).
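As a minimal Python sketch of Step 2, each subject's \(BIP\) is one draw from the row of the classification rule selected by that subject's \(S(1)\). The function name and the numeric rule are hypothetical.

```python
import random

def simulate_bip(S1, rule, seed=0):
    """Draw BIP_k from Mult(1, (q_0, q_1, q_2)) for each subject k.

    rule[j][i] = P(BIP = i | S(1) = j), the user-specified classification rule.
    """
    rng = random.Random(seed)
    return [rng.choices((0, 1, 2), weights=rule[s])[0] for s in S1]

# Hypothetical classification rule; each row sums to 1
rule = [
    [0.9, 0.1, 0.0],  # S(1) = 0
    [0.1, 0.8, 0.1],  # S(1) = 1
    [0.0, 0.1, 0.9],  # S(1) = 2
]
S1 = [0, 0, 1, 2, 2, 1]
BIP = simulate_bip(S1, rule)
```

With an identity rule (each row placing probability 1 on the diagonal), the simulated \(BIP\) coincides with \(S(1)\).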

Trichotomous \(\, X, S(1),\) and \(\, BIP\) Using Approach 2

Note: All variables marked with an asterisk (*) are continuous.

  1. The user specifies \(\mathop{\mathrm{corr}}(BIP^*, S^*(1))\).
  2. Assuming that \(BIP^*\) follows an additive measurement error model, i.e., \(BIP^* := S^*(1) + \delta\), where \(\delta \sim N(0, \sigma^2_{\delta})\) with an unknown \(\sigma^2_{\delta}\), and \(\delta, \epsilon\), and \(X^*\) are independent, solve the following equation for \(\mathop{\mathrm{var}}\delta = \sigma^2_{\delta}\): \[ \mathop{\mathrm{corr}}(BIP^*, S^*(1)) = \sqrt{\frac{\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon}{\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon + \mathop{\mathrm{var}}\delta}} \]
  3. For the fixed \(\phi^{\ast}_0\) and \(\phi^{\ast}_2\) derived above, define \[\begin{align*} Spec_{BIP}(\xi_0) &= P(BIP^{\ast} \leq \xi_0 \mid S^{\ast} \leq \phi^{\ast}_0)\\ FN^1_{BIP}(\xi_0) &= P(BIP^{\ast} \leq \xi_0 \mid S^{\ast} \in (\phi^{\ast}_0,\phi^{\ast}_2])\\ FN^2_{BIP}(\xi_0) &= P(BIP^{\ast} \leq \xi_0 \mid S^{\ast} > \phi^{\ast}_2)\\ Sens_{BIP}(\xi_2) &= P(BIP^{\ast} > \xi_2 \mid S^{\ast} > \phi^{\ast}_2)\\ FP^1_{BIP}(\xi_2) &= P(BIP^{\ast} > \xi_2 \mid S^{\ast} \in (\phi^{\ast}_0,\phi^{\ast}_2])\\ FP^0_{BIP}(\xi_2) &= P(BIP^{\ast} > \xi_2 \mid S^{\ast} \leq \phi^{\ast}_0) \end{align*}\]
  4. Using the same technique as in the derivation of \(\phi^{\ast}_0\) and \(\phi^{\ast}_2\) above, find \(\xi_0=\xi^{\ast}_0\) and \(\xi_2=\xi^{\ast}_2\) that numerically solve \[\begin{align*} P_0 &= \widehat{Spec}_{BIP}(\xi_0)P_0 + \widehat{FN}_{BIP}^1(\xi_0)P_1 + \widehat{FN}_{BIP}^2(\xi_0)P_2\\ P_2 &= \widehat{Sens}_{BIP}(\xi_2)P_2 + \widehat{FP}_{BIP}^1(\xi_2)P_1 + \widehat{FP}_{BIP}^0(\xi_2)P_0 \end{align*}\] and compute \[ Spec_{BIP} = \widehat{Spec}_{BIP}(\xi^{\ast}_0),\; Sens_{BIP} = \widehat{Sens}_{BIP}(\xi^{\ast}_2),\; \textrm{etc.} \]
  5. For a subject with biomarker measurement \(S_k(1)\), generate \(BIP_k\) by a draw from \(\mathsf{Mult}(1, (q_0, q_1, q_2))\), where \(q_i\), \(i=0,1,2\), are determined by \(Sens_{BIP}\), \(Spec_{BIP}\), etc. obtained in Step 4.
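The equation in Step 2 can be rearranged in closed form, which may help when checking inputs. Writing \(r = \mathop{\mathrm{corr}}(BIP^*, S^*(1))\) (shorthand introduced here) and assuming \(0 < r \leq 1\), squaring both sides and solving for \(\mathop{\mathrm{var}}\delta\) gives

\[\begin{align*} r^2 &= \frac{\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon}{\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon + \mathop{\mathrm{var}}\delta}\\ \sigma^2_{\delta} = \mathop{\mathrm{var}}\delta &= \left(\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon\right)\left(\frac{1}{r^2} - 1\right) \end{align*}\]

For example, with \(\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon = 1\) and \(r = 0.5\), \(\sigma^2_{\delta} = 1 \times (4 - 1) = 3\); with \(r = 1\), \(\sigma^2_{\delta} = 0\) and \(BIP^*\) coincides with \(S^*(1)\).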

Continuous \(\, X^*, S^*(1),\) and \(\, BIP^*\)

  1. The user specifies \(\mathop{\mathrm{corr}}(BIP^*, S^*(1))\).
  2. Assuming that \(BIP^*\) follows an additive measurement error model, i.e., \(BIP^* := S^*(1) + \delta\), where \(\delta \sim N(0, \sigma^2_{\delta})\) with an unknown \(\sigma^2_{\delta}\), and \(\delta, \epsilon\), and \(X^*\) are independent, solve the following equation for \(\mathop{\mathrm{var}}\delta = \sigma^2_{\delta}\): \[ \mathop{\mathrm{corr}}(BIP^*, S^*(1)) = \sqrt{\frac{\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon}{\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon + \mathop{\mathrm{var}}\delta}} \]
  3. For a subject with biomarker measurement \(S^*_k(1)\), generate \(BIP^*_k\) as \(BIP^*_k = S^*_k(1) + \delta\) using \(\sigma^2_{\delta} = \mathop{\mathrm{var}}\delta\) obtained in Step 2.
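Steps 2 and 3 can be sketched in Python (standard library only) as follows. The sketch solves the Step 2 equation by squaring both sides and isolating \(\mathop{\mathrm{var}}\delta\). The function name, the argument var_s (standing for \(\mathop{\mathrm{var}}X^* + \mathop{\mathrm{var}}\epsilon\)), and the inputs are hypothetical.

```python
import random

def simulate_bip_continuous(S1_star, corr, var_s, seed=0):
    """Return (BIP_star, sigma2_delta) with BIP* = S*(1) + delta.

    corr  = user-specified corr(BIP*, S*(1)), required to lie in (0, 1]
    var_s = var X* + var epsilon, the variance of S*(1)
    """
    if not 0 < corr <= 1:
        raise ValueError("corr must lie in (0, 1]")
    rng = random.Random(seed)
    # Step 2: corr^2 = var_s / (var_s + var_delta), solved for var_delta
    sigma2_delta = var_s * (1 / corr ** 2 - 1)
    sd_delta = sigma2_delta ** 0.5
    # Step 3: add independent N(0, sigma2_delta) noise to each S*_k(1)
    BIP_star = [s + rng.gauss(0, sd_delta) for s in S1_star]
    return BIP_star, sigma2_delta

# Hypothetical inputs: S*(1) values with variance 1 and a target correlation of 0.5
rng = random.Random(1)
S1_star = [rng.gauss(0, 1) for _ in range(1000)]
BIP_star, sigma2_delta = simulate_bip_continuous(S1_star, corr=0.5, var_s=1.0)
```

A lower specified correlation yields a larger \(\sigma^2_{\delta}\), i.e., a noisier predictor; a correlation of 1 yields \(\sigma^2_{\delta}=0\), so that \(BIP^*_k = S^*_k(1)\).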