--- title: "pwr4exp: Power Analysis for Experimental Designs" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{pwr4exp: Power Analysis for Experimental Designs} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: markdown: wrap: 72 --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(pwr4exp) ``` # Foundational Concepts The **pwr4exp** R package offers functionality for power analysis. The package is developed based on linear mixed model (LMM) theory, offering tailored functions for standard experimental designs in animal science and beyond. The current version does not yet support non-normal response variables, such as those encountered in generalized LMM. Linear mixed model is a powerful tool for analyzing data from various experimental designs, especially when accounting for both fixed and random effects. The general form of an LMM can be expressed as: $$ y = X\beta + Zu + \varepsilon $$ where: $y$ represents the observations of the response variable, $\beta$ represents the fixed effect coefficients, $u$ denotes the random effects, where $u \sim N_q(0, G)$, $\varepsilon$ represents the random errors, where $\varepsilon \sim N_n(0, R)$, $X_{(n \times p)}$ and $Z_{(n \times q)}$ are the design matrices for the fixed and random effects, respectively. It is assumed that $u$ and $\varepsilon$ are independent, and the marginal distribution of $y$ follows a normal distribution $y \sim N_n(X\beta, V)$, where: $$ V = ZGZ^T + R $$ ## Inference on Treatment Effects Inference on treatment effects often involves testing omnibus hypotheses and contrasts. These can be formulated using the general linear hypothesis: $$ H_0: K'\beta = 0 $$ where $K'$ is a contrast matrix. When the variance-covariance parameters in $G$ and $R$ are known, the estimate of $\beta$ is: $$ \hat{\beta} = (X^TV^{-1}X)^{-1}X^TV^{-1}y $$ And its variance is: $$ C = (X^TV^{-1}X)^{-1} $$ The sampling distribution of $K'\hat{\beta}$ is: $$ K'\hat{\beta} \sim N(0, K'CK) $$ However, in practical situations, the matrices $G$ and $R$ are unknown and must be estimated using methods like Maximum Likelihood (ML) or Restricted ML (REML). The estimate of $\beta$ is obtained by plugging in the estimated covariance matrices $\hat{V}$, where: $$ \hat{V} = Z\hat{G}Z^T + \hat{R} $$ The resulting estimate of $\beta$ is: $$ \hat{\beta} = (X^T\hat{V}^{-1}X)^{-1}X^T\hat{V}^{-1}y $$ And its estimated variance is: $$ \hat{C} = (X^T\hat{V}^{-1}X)^{-1} $$ When testing the null hypothesis $H_0: K'\beta = 0$, an approximate F-statistic is used. The F-statistic is given by: $$ F = \frac{(K'\hat{\beta})' [K'\hat{C}K']^{-1} (K'\hat{\beta})}{\text{rank}(K)} $$ $F$ follows an approximate F-distribution $F(v_1, v_2)$ under $H_0$, where $v_1 = \text{rank}(K) \geq 1$ represents the numerator degrees of freedom (df), $v_2$ is the denominator df. When $\text{rank}(K) = 1$, the F-statistic simplifies to the square of the t-statistic: $$ F = t^2 $$where $t = \frac{K'\hat{\beta}}{\sqrt{K'\hat{C}K}}$ with $v_2$ df. In balanced designs, where data is analyzed using a variance components model—commonly applied in experimental animal research—$v_2$​ can be precisely determined through degrees of freedom decomposition, as applied in analysis of variance (ANOVA): $$ v_2 = n - 1 - [\text{rank}(X) - 1] - [\text{rank}(Z) - 1] $$ However, for general cases, such as unbalanced designs or models with correlated random intercept and slope effects, $v_2$ must be approximated using methods like Satterthwaite’s approximation (Fai and Cornelius, 1996) or Kenward-Roger’s method (Kenward and Roger, 1997), as implemented in the **lmerTest** package. ## Power Calculation Under the Alternative Hypothesis Under the alternative hypothesis $H_A: K'\beta \neq 0$, the F-statistic follows a non-central distribution $F(v_1, v_2, \phi)$, where $\phi$ is the non-centrality parameter that measures the departure from the null hypothesis $H_0$. The non-centrality parameter $\phi$ is given by: $$ \phi = (K'\hat{\beta})' [K'\hat{C}K']^{-1} (K'\hat{\beta}) $$ Once the distribution of the F-statistic under $H_A$ is known, the power of the test can be calculated as the conditional probability of rejecting $H_0$ when $H_A$ is true: $$ \text{Power} = P(\text{reject } H_0: F > F_{\text{crit}} \mid H_A) $$ Where: $F_{\text{crit}}$ is the critical value of the F-statistic used to reject $H_0$, determined such that $P(F > F_{\text{crit}} \mid H_0) = \alpha$, where $\alpha$ is the type I error rate. The determination of the degrees of freedom $v_1$ and $v_2$, as well as the non-centrality parameter $\phi$, are critical steps for power calculation. Notably, $\phi$ resembles the numerator of the F-statistic but with population parameters for $\beta$, $G$, and $R$ replacing their sample estimates $\hat{\beta}$, $\hat{G}$, and $\hat{R}$. Generally, power analysis requires specifying the following components: - The **Design** to be evaluated, which determines the matrices $X$ (fixed effects) and $Z$ (random effects). - The **Treatment Effects**, which determine $\beta$ (the fixed effect coefficients). - The **Variance Components**, which determine $G$ (the covariance matrix of the random effects) and $R$ (the covariance matrix of the residuals). A key aspect of conducting a valid power analysis is obtaining **reasonable estimates for the magnitude of the parameters** that will be used in the model. This includes: - **Treatment Effects** ($\beta$): The size of the treatment effect(s) you expect to detect. This can be obtained from previous studies, pilot experiments, or subject-matter expertise. - **Variance Components** ($G$ and $R$): These represent the variability in the random effects and residual errors. Variance estimates are often obtained from: - **Pilot studies** or preliminary data, which can provide initial estimates of variability in random effects (e.g., subject-to-subject variability or group-level variability). - **Literature** on similar experiments, where variance components have been reported. - **Subject-matter expertise**, where researchers provide estimates based on their knowledge of the system being studied. Performing a power analysis with unrealistic parameter magnitudes can lead to incorrect conclusions, either overestimating the likelihood of detecting a treatment effect or requiring an unnecessarily large sample size. # Getting Started This section provides an overview of the package's functionality. The **pwr4exp** package is designed to streamline statistical power calculations into a simple, user-friendly pipeline. Performing a power analysis in **pwr4exp** involves two steps: creating a design object and then calculating power or determining sample size. ## Step1: Creating a Design Object Design objects in **pwr4exp** can be created using functions that generate several standard experimental designs available in the package. These functions include: - `designCRD(treatments, label, replicates, formula, beta, sigma2)`: For completely randomized design (CRD). - `designRCBD(treatments, label, blocks, formula, beta, VarCov, sigma2, ...)`: For randomized complete block design (RCBD). - `designLSD(treatments, label, squares, reuse, formula, beta, VarCov, sigma2, ...)`: For Latin square design (LSD). - `designCOD(treatments, label, squares, formula, beta, VarCov, sigma2, ...)`: For crossover design (COD), which is a special case of LSD with time periods and individuals as blocks. Period blocks are reused when replicating squares. - `designSPD(trt.main, trt.sub, label, replicates, formula, beta, VarCov, sigma2, ...)`: For split-plot design (SPD). ### Arguments The arguments of these functions fall into the following main categories: #### **Treatment structure** The arguments `treatments`, `trt.main`, and `trt.sub` are used to specify the treatment structure. For designs other than SPD, `treatments` defines the structure, whereas in SPD, `trt.main` and `trt.sub` refer to the main plot and subplot levels, respectively. The treatment structure is specified by an integer-valued vector, where the length of the vector represents the number of treatment factors, and each value indicates the number of levels for each factor. A maximum of two factors is allowed (for SPD, this applies to both the main plot and subplot levels), arranged in a factorial design. For instance, `treatments = 2` defines an experiment involving two treatments (e.g., control vs. intervention). For two factors, `treatments = c(2, 2)` sets up a "2x2" factorial design to study main effects and interactions. In the case of SPD, `trt.main = 2` specifies two levels of the main plot factor, while `trt.sub = c(2, 2)` defines a "2x2" factorial design at the subplot level. #### **Label** The optional `label` argument is a list of character vectors that specifies the names of treatment factors and their corresponding levels. Each vector in the list represents a treatment factor, where the name of the vector defines the name of the factor, and the values in the vector are the labels for the levels of that factor. If the `label` argument is not provided, default names are assigned to the factors and levels. For one treatment factor, the default is `list(trt = c("1", "2", ...))`. For two factors, the default is `list(facA = c("1", "2", ...), facB = c("1", "2", ...))`, where "facA" and "facB" represent the two factors, and "1", "2", etc., represent the levels of each factor. For example, `list(trt = c("ad libitum", "fasting"))` customizes the levels of a single treatment factor to "ad libitum" and "fasting." For multiple factors, `list(feed = c("ad libitum", "fasting"), dosage = c("D0", "D1", "D2"))` names the first factor "feed" with levels "ad libitum" and "fasting," and the second factor "dosage" with levels "D0," "D1," and "D2." #### **Replication** Given the distinct randomization and replication mechanisms across designs, the arguments `replicates`, `blocks`, and `squares` are used to represent replication and to indicate the sample size for different designs: - The argument `replicates` specifies the number of experimental units per treatment in a CRD or the number of main plots (i.e., the number of experimental units per treatment at the main plot level) in a SPD. - `blocks` specifies the number of blocks in a RCBD. - `squares` specifies the number of squares in a replicated LSD or COD. In a CRD, setting `replicates = 10` and `treatments = 4` (or `treatments = c(2, 2)`) means that each treatment group consists of 10 experimental units, resulting in a total of 40 experimental units. When configuring an SPD, `replicates = 10` with `trt.main = 4` (or `trt.main = c(2, 2)`) signifies that each main plot treatment is replicated across 10 experimental units, totaling 40 main plots. For an RCBD, using `blocks = 10` along with `treatments = 4` (or `treatments = c(2, 2)`) ensures that all four treatments are replicated across 10 different blocks, leading to a total of 40 experimental units. In an LSD, setting `squares = 3` with `treatments = 4` (or `treatments = c(2, 2)`) implies the replication of a single "4×4" square layout 3 times, resulting in a total of 48 experimental units. #### **Model** The `formula` argument specifies the model formula that will be used to test effects during post-experimental data analysis. This formula follows the same syntax used in R’s `lm` function (for linear models) and `lmer` function (for linear mixed models) to specify fixed and random effects. Each design-generating function within the package comes with a default model formula. The default formula incorporates interaction terms when two treatment factors are present and fits blocking factors as random effects where applicable. You can inspect the default formula from the generated design object using `design$formula`, or by checking the function's documentation (`?function`). For example, in a SPD with one treatment factor at the main plot level and two factors at the subplot level, the default model formula would be: `y ~ trt.main * facA.sub * facB.sub + (1 | mainplot)`. This formula tests all interactions, including three-way interactions. If no interaction between the main plot and subplot factors is assumed, the model formula can specify the model as : `y ~ trt.main + facA.sub * facB.sub + (1 | mainplot)` by the user. Notably, when specifying the model formula manually, the names of the treatment factors must be consistent with those provided in the `label` argument. #### **Effect Size** The `beta` argument is a numeric vector of expected model coefficients, representing the effect sizes. The first element corresponds to the intercept term, which represents the mean of the reference level for categorical variables. Subsequent elements correspond to the effect sizes of the independent variables in the order they appear in the model matrix. For categorical variables, each coefficient represents the difference between a non-reference level and the reference level (intercept), as the `contr.treatment` contrast coding is used to construct the model matrix. It is important to ensure that the `beta` vector aligns with the columns of the model matrix, including any dummy variables created for categorical predictors. These values can either be specified directly or transformed from group means. For example, consider a factor with 2 levels (`treatments = 2`), representing control vs. intervention (`label = list(trt = c("control", "intervention"))`). If `beta = c(10, 5)`, this indicates that the mean of the control group is 10, and the effect of the intervention is 5 units higher than the control. In another example, consider a "2 × 2" factorial arrangement (`treatments = c(2, 2)` & `label= list(A = c("A1", "A2"), B = c("B1", "B2"))`): $$ \begin{array}{c|c|c} & B1 & B2 \\ \hline A1 & 10 & 6 \\ A2 & 8 & 12 \\ \end{array} $$ The `beta` argument is specified as `beta = c(10, -2, -4, 8)`, which represents the following effects: - The mean of the reference level (`A1B1`) is 10. - The effect of `A2` alone is -2 (i.e., `A2B1 - A1B1`). - The effect of `B2` alone is -4 (i.e., `A1B2 - A1B1`). - The interaction between `A2` and `B2` is 8, representing the additional effect of combining `A2` and `B2` compared to what would be expected from the sum of their individual effects (`A2B2 - A2B1 - A1B2 + A1B1`). #### **Variance-Covariance** The `VarCov` argument specifies the variance-covariance components of random effects. If multiple terms exist for a single random effect group, the variance-covariance matrix should be provided. For example, the covariance matrix for random intercepts and random slopes for a single grouping factor is structured as: $$ \begin{pmatrix} \tau_0^2 & \tau_{12} \\ \tau_{12} & \tau_1^2 \end{pmatrix} $$ where $\tau_0^2$ represents the variance of the random intercept, $\tau_1^2$ represents the variance of the random slope, and $\tau_{12}$ is the covariance between them. For multiple random effect groups, supply the variance (for a single random effect term) or the variance-covariance matrix (for two or more random effect terms) of each group in a list, following the order specified in the model formula. In the standard designs available in **pwr4exp**, the corresponding LMMs are typically variance component models, i.e., models without random slopes. For example, in an RCBD with block as a random effect, the required input is the variance between blocks: `VarCov = \tau_b^2`. If there are multiple random effects, for example, in an LSD with both row and column blocks as random effects, the required input would be `VarCov = list(\tau_r^2, \tau_c^2)`, representing the variances of the row and column blocks, respectively. #### **Error Variance** The `sigma2` argument represents the variance of the random error in the model. This value specifies the error variance, which captures the unexplained variability within the model that is not accounted for by the fixed or random effects. It is an important parameter for determining the power. ### Customized design - `designCustom(design.df, formula, beta, VarCov, sigma2, design.name, ...)` If a design is not predefined in the package, it can be constructed using the `designCustom` function. The required inputs are `design.df`, `formula`, `beta`, `VarCov`, `sigma2`, and optionally, `design.name`. All arguments have been defined above, except for `design.df`, which refers to a data frame containing the columns of independent variables, outlining the structure of the data to be collected in the experiment. Note that this data frame does not include a response variable. The `design.name` argument allows for specifying a custom name for the design, if desired. ## Step2: Calculating Power or Sample Size Once the design object is created, calculating power or sample size is straightforward. Power for omnibus tests, including main effects and their interactions (if specified in the model during the creation of the design object), can be calculated using: - `pwr.anova(design, alpha = 0.05, ...)` The required inputs include: 1. `design`, the object created in [Step 1](##%20Step1:%20Creating%20a%20Design%20Object); 2. `alpha`, indicating the Type I error rate, with a default value of 0.05. For specific contrasts, power can be calculated using: - `pwr.contrast(design, alpha = 0.05, spec, method, ...)` The syntax of the **emmeans** package is inherited to specify contrasts of interest. The required inputs are: 1. `design`; 2. `alpha`; 3. `spec`, an argument inherited from **emmeans**, which specifies the names of the factors over which the contrasts are performed. 4. `method`, another argument inherited from **emmeans**, which specifies the method of contrasts (e.g., "pairwise", "trt.vs.ctrl", "poly"). The minimal sample size needed to achieve a target power can be determined using: - `find_sample_size(design.quote, alpha = 0.05, target.power = 0.8, n_init = 2, n_max = 99)` This function calculates the minimum sample size necessary by incrementally checking integers from `n_init` to `n_max`. The required inputs are: 1. `design.quote`: A quoted design object with an unknown and unevaluated replication argument, to be evaluated with varying values; 2. `alpha`; 3. `target.power`: A single value specifying the target power for all effects, or a vector specifying individual target power levels for each effect, with a default value of 0.05; 4. `n_init`: The initial number of replications for the iterative process, with a default value of 2; 5. `n_max`: The maximum number of replications for the iterative process, with a default value of 99. Currently, sample size determination is available only for omnibus tests and not for specific contrasts in **pwr4exp**. # Practical Examples ## Example 1. Completely Randomized Design In this example, we will create a CRD with one treatment factor. The design parameters are as follows: 1. Treatments: 1 treatment factor with 4 levels; 2. Replicates: 8 experimental units per treatment. 3. Mean and effect size: The mean of control is 35, and the effects of other three treatments are -5, +2, and +3. 4. Error variance: The variance of response variable is 15. **Create the CRD** ```{r} crd <- designCRD( treatments = 4, replicates = 8, beta = c(35, -5, 2, 3), sigma2 = 15 ) ``` **Power for the omnibus test** The power of the omnibus test (i.e., F-test) can be calculated using the `pwr.anova` function. Under the type I error rate of 0.05, the power for testing an overall difference among treatments is 0.95467. ```{r} pwr.anova(design = crd) ``` **Power for specific contrasts** To assess the power for specific contrasts, use the `pwr.contrast` function. For example, to calculate the power for detecting differences between treatments and the control: ```{r} pwr.contrast(design = crd, specs = ~ trt, method = "trt.vs.ctrl") ``` To calculate the power for detecting linear or polynomial trends across the treatment levels: ```{r} pwr.contrast(design = crd, specs = ~ trt, method = "poly") ``` ## Example 2. Randomized Complete Block Design In this example, we will create an RCBD with two treatment factors. The design parameters are as follows: 1. Treatments: A 2x2 factorial design. 2. Replicates: 8 blocks. 3. Mean and effect size: The mean of the control (A1B1) is 35. The effect of A2 alone is an increase of 5 units, and the effect of B2 alone is an increase of 3 units. The interaction between A2 and B2 introduces an additional effect of -2 units, meaning the combined effect of A2 and B2 is 2 units below the sum of their individual effects. The corresponding cell means are:$$ \begin{array}{c|c|c} & B1 & B2 \\ \hline A1 & 35 & 38 \\ A2 & 40 & 41 \\ \end{array} $$ 4. Variance among blocks: 11. 5. Error variance: 4. The total variance of the response variable (15) is decomposed into variance between blocks (11) and variance within blocks (4). **Create the RCBD** ```{r} rcbd <- designRCBD( treatments = c(2, 2), blocks = 8, beta = c(35, 5, 3, -2), VarCov = 11, sigma2 = 4 ) ``` **Power for the omnibus test** ```{r} pwr.anova(design = rcbd) ``` **Power for specific contrasts** The power for detecting the effect of facA, both overall and within each level of facB, can be assessed as follows: ```{r} # across all levels of facB pwr.contrast(design = rcbd, specs = ~ "facA", method = "pairwise") # at each level of facB pwr.contrast(design = rcbd, specs = ~ facA|facB, method = "pairwise") ``` **Sample Size Determination** To determine the number of blocks required to achieve the target power, the `find_sample_size` function can be used. First, we create a quoted design object where `blocks = n` remains unevaluated: ```{r} rcbd_quote <- quote( designRCBD( treatments = c(2, 2), blocks = n, beta = c(35, 5, 3, -2), VarCov = 11, sigma2 = 4 ) ) ``` The optimal sample size for the target power within the range of n_init and n_max can be determined as follows: ```{r} find_sample_size(design.quote = rcbd_quote, n_init = 2, n_max = 99) ``` ## Example 3. Latin Square Design In this example, we extend the design from Example 2 by introducing another blocking factor, thus creating an LSD. The treatment structure and effect sizes remain the same as in Example 2. The design controls for two sources of variability (row and column blocks) while evaluating the treatment effects. In the LSD, the total variance (15) is further decomposed into three components: - Variance between row blocks (11), - Variance between column blocks (2), and - Residual error variance (2). In this example, we will customize the labels for the two factors as follows: "temp" with levels "T1" and "T2", and "dosage" with levels "D1" and "D2". **Create the LSD** ```{r} lsd <- designLSD( treatments = c(2, 2), label = list(temp = c("T1", "T2"), dosage = c("D1", "D2")), squares = 4, reuse = "both", beta = c(35, 5, 3, -2), VarCov = list(11, 2), sigma2 = 2 ) ``` **Power for the omnibus test** ```{r} pwr.anova(design = lsd) ``` **Power for specific contrasts** The power for detecting the effect of *dosage*, both overall and within each level of *temp*, can be assessed as follows: ```{r} # the effect of dosage across all levels of temp pwr.contrast(design = lsd, specs = ~ "dosage", method = "pairwise") # the effect of dosage at each level of temp pwr.contrast(design = lsd, specs = ~ dosage|temp, method = "pairwise") ``` **Sample Size Determination** To determine the number of squares required to achieve the target power, the `find_sample_size` function can be used. First, we create a quoted design object where `squares = n` remains unevaluated: ```{r} lsd_quote <- quote( designLSD( treatments = c(2, 2), squares = n, reuse = "both", beta = c(35, 5, 3, -2), VarCov = list(11, 2), sigma2 = 2 ) ) ``` The optimal sample size for the target power within the range of n_init and n_max can be determined as follows: ```{r} find_sample_size(design.quote = lsd_quote, n_init = 2, n_max = 99) ``` ## Example 4: Split-plot Design In this example, we will create a SPD with two treatment factors, one at each level. The design parameters are as follows: 1. Treatments: One main plot factor having 2 levels, and another factor with 3 levels at the sub-plot level. 2. Label: The two treatment factors are labeled as "Main" with levels "Main1" and "Main2," and "Sub" with levels "Sub1" and "Sub2," corresponding to the main plot and subplot, respectively. 3. **Replicates**: There are 10 main plots, with 5 main plots per "Main" treatment level, resulting in 10 experimental units (blocks) for each "Sub" treatment level. 4. The mean of the control (Main1 with Sub1) is 20. The effect of Main2 alone is an increase of 2 units. The effects of Sub2 and Sub3 alone are increases of 2 and 4 units, respectively. The interaction between Main2 and Sub2 is zero. The interaction between Main2 and Sub3 introduces an additional effect of 2 units, meaning the combined effect of Main2 and Sub3 is 2 units above the sum of their individual effects. The corresponding cell means are:$$ \begin{array}{c|c|c} & Sub1 & Sub2 & Sub3 \\ \hline Main1 & 20 & 22 & 24\\ Main2 & 22 & 24 & 28 \\ \end{array} $$ 5. The total variance (15) is assumed to decompose into 4 for whole-plot error and 11 for subplot error. **Create the SPD** ```{r} spd <- designSPD( trt.main = 2, trt.sub = 3, replicates = 10, label = list(Main = c("Main1", "Main2"), Sub = c("Sub1", "Sub2", "Sub3")), beta = c(20, 2, 2, 4, 0, 2), VarCov = list(4), sigma2 = 11 ) ``` **Power for the omnibus test** ```{r} pwr.anova(spd) ``` **Power for specific contrasts** The power for detecting the effect of subplot treatment under each level of main plot treatment, can be assessed as follows: ```{r} pwr.contrast(design = spd, specs = ~ Sub|Main, method = "trt.vs.ctrl") ``` ## Example 5: A customized design In this example, we will create a COD arranged at the "subplot" level of a split-plot layout. Sixteen subjects, split evenly between two breeds (8 subjects per breed), are enrolled in a 4x4 crossover design. Within each breed, the subjects are further divided into two squares (for a total of four squares), with each square consisting of four subjects. Each subject follows one of four treatment sequences and receives treatments across four periods. This hybrid SPD-COD combines elements of a SPD and a COD to evaluate treatment effects across different breeds. The Latin Square structure ensures that within each square, treatments are balanced across periods, while the split-plot element accounts for breed as a whole-plot factor. The overall structure is as follows: - **Treatment structure** - **Whole-plot level**: Breed (2 breeds, with 8 subjects in each breed) - **Sub-plot level**: A 2x2 factorial design with two treatment factors, facA and facB. - **Experimental units** - Whole-plot factor: Subjects (randomly selected from two breeds) - Sub-plot factor: Measurements taken from subjects at each period. - **Variance decomposition** The overall variance (15) is decomposed into three main components: subject variance (7), period variance (4), and random error (4). **Construct the design data frame** To create the design object for this hybrid SPD-COD, we first need to construct a data frame containing all the independent variables, structured to resemble the actual experimental data. This data frame can be generated using the internal function `df.cod`, which builds a data frame for a COD with the specified treatment structure and number of replications. Once the base data frame is generated, a column for the "Breed" variable is manually added. It is important to note that no randomization occurs at this stage—the data frame simply represents the data structure and is not the actual randomized experimental data. Its purpose is to create the design layout. ```{r} df_spd_cod <- pwr4exp:::df.cod( treatments = c(2, 2), squares = 4 ) ## Create main plot factor, i.e., breed df_spd_cod$Breed <- rep(c("1", "2"), each = 32) ## Check data structure head(df_spd_cod, n = 4); tail(df_spd_cod, n = 4) ``` **Specify the model formula** Next, an appropriate model formula must be specified to fit the experimental design. The formula captures the interactions between breed, the two treatment factors (facA and facB), and the random effects for subjects and periods. ```{r} formula <- y ~ Breed*facA*facB + (1|subject) + (1|period) ``` **Fixed effects** We then specify the fixed effects for the model, where beta contains the baseline intercept and the effects of each factor and their interactions. ```{r} beta = c( `(Intercept)` = 35, # Baseline (mean of Breed1_A1_B1) Breed2 = -5, # Effect of the second breed alone facA2 = -5, # Effect of A2 alone facB2 = 1, # Effect of B2 alone `Breed2:facA2` = 1, # Interaction between Breed2 and A2 `Breed2:facB2` = 0, # Interaction between Breed2 and B2 `facA2:facB2` = 2, # Interaction between A2 and B2 `Breed2:facA2:facB2` = 1 # Three-way interaction between Breed2, A2, and B2 ) ``` **Create the design object** After constructing the data frame, model formula, and fixed effects, the design object can be created using the `designCustom` function. The variance-covariance structure and error variance are also provided to complete the design. ```{r} SPD_COD <- designCustom( design.df = df_spd_cod, formula = formula, beta = beta, VarCov = list(7, 4), sigma2 = 4, design.name = "hybrid SPD COD" ) ``` **Power for the omnibus test** ```{r} pwr.anova(SPD_COD) ``` **Power for specific contrasts** We can also calculate the power for specific contrasts. For example, to assess the power for detecting differences between facA1 and facA2 at each level of facB within each breed: ```{r} pwr.contrast(SPD_COD, ~facA|facB|Breed, "pairwise") ``` # References Hrong-Tai Fai, A., & Cornelius, P. L. (1996). Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. Journal of statistical computation and simulation, 54(4), 363-378. Kenward, M. G., & Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997. # Acknowledgments The development of **pwr4exp** has benefited greatly from several R packages. Specifically, it leverages and modifies critical functionality from **lmerTest**, **car**, and **emmeans** to determine degrees of freedom (\(v_1\) and \(v_2\)) and the non-centrality parameter (\(\phi\)), making power analysis more accessible and efficient. We are particularly grateful to the authors and contributors of these packages, as well as the broader R community for their valuable tools and resources.