Install and load pwrss R package:

install.packages("pwrss")
library(pwrss)

Alternatively, use the web app below:

If you find the package and related material useful please cite as:

Bulus, M. (2023). pwrss: Statistical Power and Sample Size Calculation Tools. R package version 0.3.1. https://CRAN.R-project.org/package=pwrss

Bulus, M., & Polat, C. (in press). pwrss R paketi ile istatistiksel guc analizi [Statistical power analysis with pwrss R package]. Ahi Evran Universitesi Kirsehir Egitim Fakultesi Dergisi. https://osf.io/ua5fc/download/

Acknowledgements

I would like to thank Roland Thijs and Fred Oswald, who caught some typos in the earlier version of this vignette.

Please send any bug reports, feedback, or questions to bulusmetin [at] gmail.com

1 Generic Functions

These generic functions calculate and return statistical power along with optional Type I and Type II error plots (as long as the test statistic and degrees of freedom is known). These functions are handy because z, t, \(\chi^{2}\), and F statistics and associated degrees of freedom can be readily found in reports, published articles, or standard software output (such as SAS, Stata, SPSS, Jamovi, etc.). Statistical power can be calculated via putting the test statistics for ncp in the functions below; however, do not overinterpret post-hoc power rates.

1.1 t Test

power.t.test(ncp = 1.96, df = 99, alpha = 0.05,
             alternative = "equivalent", plot = TRUE)

##      power ncp.alt ncp.null.1 ncp.null.2 alpha df   t.crit.1  t.crit.2
##  0.2371389       0      -1.96       1.96  0.05 99 -0.3155295 0.3155295

1.2 z Test

power.z.test(ncp = 1.96, alpha = 0.05, 
             alternative = "not equal", plot = TRUE)

##      power ncp.alt ncp.null alpha  z.crit.1 z.crit.2
##  0.5000586    1.96        0  0.05 -1.959964 1.959964

1.3 \(\chi^2\) Test

power.chisq.test(ncp = 15, df = 20,
                 alpha = 0.05, plot = TRUE)

##      power ncp.alt ncp.null alpha df chisq.crit
##  0.6110368      15        0  0.05 20   31.41043

1.4 F Test

power.f.test(ncp = 3, df1 = 2, df2 = 98,
             alpha = 0.05, plot = TRUE)

##      power ncp.alt ncp.null alpha df1 df2   f.crit
##  0.3128778       3        0  0.05   2  98 3.089203

1.5 Multiple Parameters

Multiple parameters are allowed but plots should be turned off (plot = FALSE).

power.t.test(ncp = c(0.50, 1.00, 1.50, 2.00, 2.50), plot = FALSE,
             df = 99, alpha = 0.05, alternative = "not equal")

##       power ncp.alt ncp.null alpha df  t.crit.1 t.crit.2
##  0.07852973     0.5        0  0.05 99 -1.984217 1.984217
##  0.16769955     1.0        0  0.05 99 -1.984217 1.984217
##  0.31785490     1.5        0  0.05 99 -1.984217 1.984217
##  0.50826481     2.0        0  0.05 99 -1.984217 1.984217
##  0.69698027     2.5        0  0.05 99 -1.984217 1.984217

power.z.test(alpha = c(0.001, 0.010, 0.025, 0.050), plot = FALSE,
             ncp = 1.96, alternative = "greater")

##      power ncp.alt ncp.null alpha   z.crit
##  0.1291892    1.96        0 0.001 3.090232
##  0.3570528    1.96        0 0.010 2.326348
##  0.5000144    1.96        0 0.025 1.959964
##  0.6236747    1.96        0 0.050 1.644854

power.chisq.test(df = c(80, 90, 100, 120, 150, 200), plot = FALSE, 
                 ncp = 2, alpha = 0.05)

##       power ncp.alt ncp.null alpha  df chisq.crit
##  0.06989507       2        0  0.05  80   101.8795
##  0.06856779       2        0  0.05  90   113.1453
##  0.06746196       2        0  0.05 100   124.3421
##  0.06571411       2        0  0.05 120   146.5674
##  0.06382959       2        0  0.05 150   179.5806
##  0.06175379       2        0  0.05 200   233.9943

1.6 Type I and Type II Error Plots

plot() function (S3 method) is a wrapper around the generic functions above. Assign results of any pwrss function to an R object and pass it to plot() function.

# comparing two means
design1 <- pwrss.t.2means(mu1 = 0.20, margin = -0.05, paired = TRUE,
                          power = 0.80, alpha = 0.05,
                          alternative = "non-inferior")
plot(design1)

# ANCOVA design
design2 <- pwrss.f.ancova(eta2 = 0.10, n.levels = c(2,3),
                          power = .80, alpha = 0.05)
plot(design2)

# indirect effect in mediation analysis
design3 <- pwrss.z.med(a = 0.10, b = 0.20, cp = 0.10,
                       power = .80, alpha = 0.05)
plot(design3)

2 Mean Difference (t Tests)

2.1 Parametric Tests

2.1.1 Independent Samples t Test

More often than not, unstandardized means and standard deviations are reported in publications for descriptive purposes. Another reason is that they are more intuitive and interpretable (e.g. depression scale). Assume that for the first and second groups expected means are 30 and 28, and expected standard deviations are 12 and 8, respectively.

Calculate Statistical Power

What is the statistical power given that the sample size for the second group is 50 (n2 = 50) and groups have equal sample sizes (kappa = n1 / n2 = 1)?

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1, 
               n2 = 50, alpha = 0.05,
               alternative = "not equal")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.163 
##   n1 = 50 
##   n2 = 50 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 98 
##  Non-centrality parameter = 0.981 
##  Type I error rate = 0.05 
##  Type II error rate = 0.837

Calculate Minimum Required Sample Size

What is the minimum required sample size given that groups have equal sample sizes (kappa = 1)? (\(\kappa = n_1 / n_2\))

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1, 
               power = .80, alpha = 0.05,
               alternative = "not equal")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 410 
##   n2 = 410 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 818 
##  Non-centrality parameter = 2.808 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

It is sufficient to put pooled standard deviation for sd1 because sd2 = sd1 by default. In this case, for a pooled standard deviation of 10.198 the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 10.198, kappa = 1,
               power = .80, alpha = 0.05,
               alternative = "not equal")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 410 
##   n2 = 410 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 818 
##  Non-centrality parameter = 2.808 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

It is sufficient to put Cohen’s d or Hedge’s g (standardized difference between two groups) for mu1 because mu2 = 0, sd1 = 1, and sd2 = sd1 by default. For example, for an effect size as small as 0.196 (based on previous example) the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 0.196, kappa = 1,
               power = .80, alpha = 0.05, 
               alternative = "not equal")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 410 
##   n2 = 410 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 818 
##  Non-centrality parameter = 2.806 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

2.1.2 Paired Samples t Test

Assume for the first (e.g. pretest) and second (e.g. posttest) time points expected means are 30 and 28 (a reduction of 2 points), and expected standard deviations are 12 and 8, respectively. Also assume a correlation of 0.50 between first and second measurements (by default paired.r = 0.50). What is the minimum required sample size?

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05,
               alternative = "not equal")

##  Difference between Two means 
##  (Paired Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 222 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 221 
##  Non-centrality parameter = 2.816 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

It is sufficient to put standard deviation of the difference for sd1 because sd2 = sd1 by default. In this case, for a standard deviation of difference of 10.583 the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 10.583, 
               paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05,
               alternative = "not equal")

##  Difference between Two means 
##  (Paired Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 222 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 221 
##  Non-centrality parameter = 2.816 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

It is sufficient to put Cohen’s d or Hedge’s g (standardized difference between two time points) for mu1 because mu2 = 0, sd1 = sqrt(1/(2*(1-paired.r))), and sd2 = sd1 by default. For example, for an effect size as small as 0.1883 (based on previous example) the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 0.1883, paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05, 
               alternative = "not equal")

##  Difference between Two means 
##  (Paired Samples t Test) 
##  H0: mu1 = mu2 
##  HA: mu1 != mu2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 224 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 223 
##  Non-centrality parameter = 2.818 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

2.2 Non-parametric Tests

The variable of interest for which means are compared may not follow normal distribution. One reason is that the sample size could be small. Another reason is that the parent distribution (distribution in the population) may not be normal (e.g., could be uniform, exponential, double exponential, or logistic). In these cases t test could produce biased results.

For non-parametric tests use pwrss.np.2groups() function instead of pwrss.t.2means(). All arguments are the same. Additionally, however, the parent distribution can be specified as “normal”, “uniform”, “exponential”, “double exponential”, or “logistic”. In the following examples the parent distribution is “normal” (not shown because it is the default) and can be changed with the dist or distribution argument (e.g. dist = "uniform").

Although means and standard deviations are some of the arguments in the function below, what is actually being tested is the difference between mean ranks. First, the mean difference is converted into Cohen’s d, and then into probability of superiority, which is the probability of an observation in group 1 being higher than an observation in group 2. This parameterization, expressed as means and standard deviations, helps in making comparisons and switching back and forth between parametric and non-parametric tests.

2.2.1 Independent Samples (Wilcoxon-Mann-Whitney Test)

The example below uses the same parameters as the example in the independent t test section.

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1, 
               power = .80, alpha = 0.05,
               alternative = "not equal")

##  Non-parametric Difference between Two Groups (Independent samples) 
##  Mann-Whitney U or Wilcoxon Rank-sum Test 
##  (a.k.a Wilcoxon-Mann-Whitney Test) 
##  Method: GUENTHER 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 429 
##   n2 = 429 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = 2.805 
##  Degrees of freedom = 816.21 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

2.2.2 Paired Samples (Wilcoxon Signed-rank Test)

The example below uses the same parameters as the example in the paired (dependent) t test section.

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05,
               alternative = "not equal")

##  Non-parametric Difference between Two Groups (Dependent samples) 
##  Wilcoxon signed-rank Test for Matched Pairs 
##  Method: GUENTHER 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 233 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = 2.814 
##  Degrees of freedom = 220.7 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

It is sufficient to put Cohen’s d or Hedge’s g (standardized difference between two groups or measurements) for mu1 without specifying mu2, sd1, and sd2.

2.3 Non-inferiority, Superiority, and Equivalence

These tests are useful for testing practically significant difference (non-inferiority/superiority) or practically null difference (equivalence).

2.3.1 Parametric Tests

Non-inferiority: The mean of group 1 is practically not smaller than the mean of group 2. The mu1 - mu2 difference can be as small as -1 (margin = -1) but it will still be considered non-inferior. What is the minimum required sample size?

When higher values of an outcome is better the margin takes NEGATIVE values; whereas when lower values of the outcome is better margin takes POSITIVE values.

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = -1, power = 0.80,
               alternative = "non-inferior")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: mu1 - mu2 <= margin 
##  HA: mu1 - mu2 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 144 
##   n2 = 144 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Degrees of freedom = 286 
##  Non-centrality parameter = 2.496 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority: The mean of group 1 is practically greater than the mean of group 2. The mu1 - mu2 difference is at least greater than 1 (margin = 1). What is the minimum required sample size?

When higher values of an outcome is better margin takes POSITIVE values; whereas when lower values of the outcome is better margin takes NEGATIVE values.

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "superior")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: mu1 - mu2 <= margin 
##  HA: mu1 - mu2 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1287 
##   n2 = 1287 
##  ------------------------------ 
##  Alternative = "superior" 
##  Degrees of freedom = 2572 
##  Non-centrality parameter = 2.487 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Equivalence: The mean of group 1 is practically same as mean of group 2. The mu1 - mu2 difference can be as small as -1 and as high as 1 (margin = 1). What is the minimum required sample size?

Specify the absolute value for the margin.

pwrss.t.2means(mu1 = 30, mu2 = 30, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "equivalent")

##  Difference between Two means 
##  (Independent Samples t Test) 
##  H0: |mu1 - mu2| >= margin 
##  HA: |mu1 - mu2| < margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1783 
##   n2 = 1783 
##  ------------------------------ 
##  Alternative = "equivalent" 
##  Degrees of freedom = 3564 
##  Non-centrality parameter = -2.928 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

2.3.2 Non-parametric Tests

Non-inferiority: The mean of group 1 is practically not smaller than the mean of group 2. The mu1 - mu2 difference can be as small as -1 (margin = -1). What is the minimum required sample size?

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = -1, power = 0.80,
               alternative = "non-inferior")

##  Non-parametric Difference between Two Groups (Independent samples) 
##  Mann-Whitney U or Wilcoxon Rank-sum Test 
##  (a.k.a Wilcoxon-Mann-Whitney Test) 
##  Method: GUENTHER 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 151 
##   n2 = 151 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Non-centrality parameter = 2.492 
##  Degrees of freedom = 285.13 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority: The mean of group 1 is practically greater than the mean of group 2. The mu1 - mu2 difference is at least greater than 1 (margin = 1). What is the minimum required sample size?

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "superior")

##  Non-parametric Difference between Two Groups (Independent samples) 
##  Mann-Whitney U or Wilcoxon Rank-sum Test 
##  (a.k.a Wilcoxon-Mann-Whitney Test) 
##  Method: GUENTHER 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1348 
##   n2 = 1348 
##  ------------------------------ 
##  Alternative = "superior" 
##  Non-centrality parameter = 2.487 
##  Degrees of freedom = 2571.3 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

pwrss.np.2groups(mu1 = 30, mu2 = 30, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "equivalent")

##  Non-parametric Difference between Two Groups (Independent samples) 
##  Mann-Whitney U or Wilcoxon Rank-sum Test 
##  (a.k.a Wilcoxon-Mann-Whitney Test) 
##  Method: GUENTHER 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1867 
##   n2 = 1867 
##  ------------------------------ 
##  Alternative = "equivalent" 
##  Non-centrality parameter = -2.927 
##  Degrees of freedom = 3561.91 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

3 Linear Regression (F and t Tests)

3.1 Omnibus \(F\) Test

3.1.1 \(R^2 > 0\) in Linear Regression

Omnibus F test in multiple liner regression is used to test whether \(R^2\) is greater than 0 (zero). Assume that we want to predict a continuous variable \(Y\) using \(X_{1}\), \(X_{2}\), and \(X_{2}\) variables (a combination of binary or continuous).

\[\begin{eqnarray} Y &=& \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}\]

We are expecting that these three variables explain 30% of the variance in the outcome (\(R^2 = 0.30\) or r2 = 0.30 in the code). What is the minimum required sample size?

pwrss.f.reg(r2 = 0.30, k = 3, power = 0.80, alpha = 0.05)

##  Linear Regression (F test) 
##  R-squared Deviation from 0 (zero) 
##  H0: r2 = 0 
##  HA: r2 > 0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 30 
##  ------------------------------ 
##  Numerator degrees of freedom = 3 
##  Denominator degrees of freedom = 25.653 
##  Non-centrality parameter = 12.709 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

3.1.2 \(\Delta R^2 > 0\) in Hierarchical Regression

Assume that we want to add two more predictors to the earlier regression model (\(X_{4}\), and \(X_{5}\); thus m = 2) and test whether increase in \(R^2\) is greater than 0 (zero). In total there are five predictors in the model (k = 5). By adding two predictors we are expecting an increase of \(\Delta R^2 = 0.15\). What is the minimum required sample size?

\[\begin{eqnarray} Y &=& \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + \beta_{4}X_{4} + \beta_{5}X_{5} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}\]

pwrss.f.reg(r2 = 0.15, k = 5, m = 2, power = 0.80, alpha = 0.05)

##  Hierarchical Linear Regression (F test) 
##  R-squared Change 
##  H0: r2 = 0 
##  HA: r2 > 0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 58 
##  ------------------------------ 
##  Numerator degrees of freedom = 2 
##  Denominator degrees of freedom = 51.876 
##  Non-centrality parameter = 10.213 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

3.2 Single Regression Coefficient (z or t Test)

3.2.1 Standardized versus Unstandardized Input

In the earlier example, assume that we want to predict a continuous variable \(Y\) using a continuous predictor \(X_{1}\) but control for \(X_{2}\), and \(X_{2}\) variables (a combination of binary or continuous). We are mainly interested in the effect of \(X_{1}\) and expect a standardized regression coefficient of \(\beta_{1} = 0.20\).

\[\begin{eqnarray} Y &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}\]

Again, we are expecting that these three variables explain 30% of the variance in the outcome (\(R^2 = 0.30\)). What is the minimum required sample size? It is sufficient to provide standardized regression coefficient for beta1 because sdx = 1 and sdy = 1 by default.

pwrss.t.reg(beta1 = 0.20, k = 3, r2 = 0.30, 
            power = .80, alpha = 0.05, alternative = "not equal")

##  Linear Regression Coefficient (t Test) 
##  H0: beta1 = beta0 
##  HA: beta1 != beta0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 140 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 135.331 
##  Non-centrality parameter = 2.822 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

For unstandardized coefficients specify sdy and sdx. Assume we are expecting an unstandardized regression coefficient of beta1 = 0.60, a standard deviation of sdy = 12 for the outcome and a standard deviation of sdx = 4 for the main predictor. What is the minimum required sample size?

pwrss.t.reg(beta1 = 0.60, sdy = 12, sdx = 4, k = 3, r2 = 0.30, 
            power = .80, alpha = 0.05, alternative = "not equal")

##  Linear Regression Coefficient (t Test) 
##  H0: beta1 = beta0 
##  HA: beta1 != beta0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 140 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 135.331 
##  Non-centrality parameter = 2.822 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

If the main predictor is binary (e.g. treatment/control), the standardized regression coefficient is Cohen’s d. Standard deviation of the main predictor is \(\sqrt{p(1-p)}\) where p is the proportion of sample in one of the groups. Assume half of the sample is in the first group \(p = 0.50\). What is the minimum required sample size? It is sufficient to provide Cohen’s d for beta1 (standardized difference between two groups) but specify sdx = sqrt(p*(1-p)) where p is the proportion of subjects in one of the groups.

p <- 0.50
pwrss.t.reg(beta1 = 0.20, k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "not equal")

##  Linear Regression Coefficient (t Test) 
##  H0: beta1 = beta0 
##  HA: beta1 != beta0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 552 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Degrees of freedom = 547.355 
##  Non-centrality parameter = 2.807 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

3.2.2 Non-inferiority, Superiority, and Equivalence

These tests are useful for testing practically significant effects (non-inferiority/superiority) or practically null effects (equivalence).

Non-inferiority: The intervention is expected to be non-inferior to some earlier or other interventions. Assume that the effect of an earlier or some other intervention is beta0 = 0.10. The beta1 - beta0 is expected to be positive and should be at least -0.05 (margin = -0.05). What is the minimum required sample size?

This is the case when higher values of an outcome is better. When lower values of an outcome is better the beta1 - beta0 difference is expected to be NEGATIVE and the margin takes POSITIVE values.

p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.10, margin = -0.05, 
            k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "non-inferior")

##  Linear Regression Coefficient (t Test) 
##  H0: beta1 - beta0 <= margin 
##  HA: beta1 - beta0 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 771 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Degrees of freedom = 766.745 
##  Non-centrality parameter = 2.489 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority: The intervention is expected to be superior to some earlier or other interventions. Assume that the effect of an earlier or some other intervention is beta0 = 0.10. The beta1 - beta0 is expected to be positive and should be at least 0.05 (margin = 0.05). What is the minimum required sample size?

This is the case when higher values of an outcome is better. When lower values of an outcome is better beta1 - beta0 difference is expected to be NEGATIVE and the margin takes NEGATIVE values.

p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.10, margin = 0.05, 
            k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "superior")

##  Linear Regression Coefficient (t Test) 
##  H0: beta1 - beta0 <= margin 
##  HA: beta1 - beta0 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 6926 
##  ------------------------------ 
##  Alternative = "superior" 
##  Degrees of freedom = 6921.818 
##  Non-centrality parameter = 2.487 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Equivalence: The intervention is expected to be equivalent to some earlier or other interventions. Assume the effect of an earlier or some other intervention is beta0 = 0.20. The beta1 - beta0 is expected to be within -0.05 and 0.05 (margin = 0.05). What is the minimum required sample size?

margin always takes positive values for equivalence. Specify the absolute value.

p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.20, margin = 0.05, 
            k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "equivalent")

##  Linear Regression Coefficient (t Test) 
##  H0: |beta1 - beta0| >= margin 
##  HA: |beta1 - beta0| < margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 9593 
##  ------------------------------ 
##  Alternative = "equivalent" 
##  Degrees of freedom = 9588.862 
##  Non-centrality parameter = -2.927 2.927 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

4 Logistic Regression (Wald’s z Test)

In logistic regression a binary outcome variable (0/1: failed/passed, dead/alive, absent/present) is modeled by predicting probability of being in group 1 (\(P_1\)) via logit transformation (natural logarithm of odds). The base probability \(P_0\) is the overall probability of being in group 1 without influence of predictors in the model (null). Under alternative hypothesis, the probability of being in group 1 (\(P_1\)) deviate from \(P_0\) depending on the value of the predictor; whereas under null it is same as the \(P_0\). A model with one main predictor (\(X_1\)) and two other covariates (\(X_2\) and \(X_3\)) can be constructed as

\[\begin{eqnarray} ln(\frac{P_1}{1- P_1}) &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} \newline \end{eqnarray}\]

where
\[\beta_0 = ln(\frac{P_0}{1- P_0})\]
\[\beta_1 = ln(\frac{P_1}{1- P_1} / \frac{P_0}{1- P_0})\]
Odds ratio is defined as \[OR = exp(\beta_1) = \frac{P_1}{1- P_1} / \frac{P_0}{1- P_0}\]

Example:

Assume

A squared multiple correlation of 0.20 between \(X_1\) and other covariates (r2.other.x = 0.20 in the code). It can be found in the form of adjusted R-square via regressing \(X_1\) on \(X_2\) and \(X_3\). Higher values require larger sample sizes. The default is 0 (zero).
A base probability of \(P_0 = 0.15\). This is the rate when predictor \(X_1 = 0\) or when \(\beta_1 = 0\).
Increasing \(X_1\) from 0 to 1 reduces the probability of being in group 1 from 0.15 to 0.10 (\(P_1 = 0.10\)).

What is the minimum required sample size? There are three types of specification to statistical power or sample size calculations; (i) probability specification, (ii) odds ratio specification, and (iii) regression coefficient specification (as in standard software output).

Probability specification:

pwrss.z.logreg(p0 = 0.15, p1 = 0.10, r2.other.x = 0.20,
               power = 0.80, alpha = 0.05, 
               dist = "normal")

##  Logistic Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 365 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.766 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Odds ratio specification: \[OR = \frac{P_1}{1-P1} / \frac{P_0}{1-P_0} = \frac{0.10}{1-0.10} / \frac{0.15}{1-0.15} = 0.6296\]

pwrss.z.logreg(p0 = 0.15, odds.ratio = 0.6296, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = "normal")

##  Logistic Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 365 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.766 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Regression coefficient specification: \[\beta_1 = ln(\frac{P_1}{1-P1} / \frac{P_0}{1-P_0}) = ln(0.6296) = -0.4626\]

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = "normal")

##  Logistic Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 365 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.766 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Change the distribution’s parameters for predictor X:

The mean and standard deviation of a normally distributed main predictor is 0 and 1 by default. They can be modified. In the following example the mean is 25 and the standard deviation is 8.

dist.x <- list(dist = "normal", mean = 25, sd = 8)

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = dist.x)

##  Logistic Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 2435 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.423 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Change the distribution family of predictor X:

More distribution types are supported by the function. For example, the main predictor can be binary (e.g. treatment/control groups). Often half of the sample is assigned to the treatment group and the other half to the control (prob = 0.50 by default).

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = "bernoulli")

##  Logistic Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'bernoulli' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 1723 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.789 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Change the treatment group allocation rate of the binary predictor X (prob = 0.40):

The per-subject cost in a treatment group is sometimes substantially higher than the per-subject cost in control group. At other times, it is difficult to recruit subjects for a treatment whereas there are plenty of subjects to be considered for the control group (e.g. a long wait-list). In such cases there is some leeway to pick an unbalanced sample (but not too much unbalanced). Assume the treatment group allocation rate is 40%. What is the minimum required sample size?

dist.x <- list(dist = "bernoulli", prob = 0.40)

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = dist.x)

##  Logistic Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'bernoulli' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 1826 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.766 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

5 Poisson Regression (Wald’s z Test)

In Poisson regression a count outcome variable (e.g. number of hospital/store/website visits, number of absence/dead/purchase in a day/week/month) is modeled by predicting incidence rate (\(\lambda\)) via logarithmic transformation (natural logarithm of rates). A model with one main predictor (\(X_1\)) and two other covariates (\(X_2\) and \(X_3\)) can be constructed as

\[\begin{eqnarray} ln(\lambda) &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} \newline \end{eqnarray}\]

where \(exp(\beta_0)\) is the base incidence rate.
\[\beta_1 = ln(\frac{\lambda(X_1=1)}{\lambda(X_1=0)})\]
Incidence rate ratio is defined as \[exp(\beta_1) = \frac{\lambda(X_1=1)}{\lambda(X_1=0)}\]

Assume

The expected base incidence rate is 1.65: \(exp(0.50) = 1.65\).
Increasing \(X_1\) from 0 to 1 reduces the mean incidence rate from 1.65 to 0.905: \(exp(-0.10) = 0.905\).

What is the minimum required sample size? There are two types of specification; (i) rate ratio specification (exponentiated regression coefficient), and (ii) raw regression coefficient specification (as in standard software output).

Regression coefficient specification:

pwrss.z.poisreg(beta0 = 0.50, beta1 = -0.10,
                power = 0.80, alpha = 0.05, 
                dist = "normal")

##  Poisson Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 474 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Rate ratio specification:

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                power = 0.80, alpha = 0.05, 
                dist = "normal")

##  Poisson Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 474 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Change the distribution’s parameters for predictor X:

dist.x <- list(dist = "normal", mean = 25, sd = 8)

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                alpha = 0.05, power = 0.80,
                dist = dist.x)

##  Poisson Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'normal' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 66 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

The function accomodates other types of distribution. For example, the main predictor can be binary (e.g. treatment/control groups).

Regression coefficient specification:

pwrss.z.poisreg(beta0 = 0.50, beta1 = -0.10,
                alpha = 0.05, power = 0.80,
                dist = "bernoulli")

##  Poisson Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'bernoulli' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 2003 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.801 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Rate ratio specification:

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                alpha = 0.05, power = 0.80,
                dist = "bernoulli")

##  Poisson Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'bernoulli' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 2003 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.801 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Change treatment group allocation rate of the binary predictor X (prob = 0.40):

dist.x <- list(dist = "bernoulli", prob = 0.40)

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                alpha = 0.05, power = 0.80,
                dist = dist.x)

##  Poisson Regression Coefficient 
##  (Large Sample Approx. Wald's z Test) 
##  H0: beta1 = 0 
##  HA: beta1 != 0 
##  Distribution of X = 'bernoulli' 
##  Method = DEMIDENKO(VC) 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 2095 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.792 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

6 Indirect Effect (Sobel’s z Test)

A simple mediation model can be constructed as in the figure.

Simple Mediation Model

Regression models take the form of

\[\begin{eqnarray} M &=& \beta_{0M} + \color{red} {a} X + e\newline Y &=& \beta_{0Y} + \color{red} {b} M + \color{red} {cp} X + \epsilon \newline \end{eqnarray}\]

Y is the outcome, M is the mediator, and X is the main predictor. The indirect effect is the product of a and b path coefficients. cp is the path coefficient for the direct effect. Path coefficients can be standardized or unstandardized. They are presumed to be standardized by default because the main predictor, mediator, and outcome all have standard deviations of sdx = 1, sdm = 1, and sdy = 1 in the function. Standard deviations should be specified for unstandardized path coefficients (you can find these values in descriptive tables in reports/publications).

X is continuous:

Most software applications presume no covariates in the mediator and outcome models (\(R^2_M = 0\) and \(R^2_M = 0\)). Even with no covariates, X explain some of the variance in M (\(R^2_M > 0\)) and M & X explains some of the variance in Y (\(R^2_Y > 0\)). We will almost never have an R-squared value of 0 (zero). The explained variance in the basic mediation model (the base R-squared values) can be non-trivial and is taken into account in the function by default. Thus, results may seem different from other software outputs.

# mediation model with base R-squared values
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
            power = 0.80, alpha = 0.05)

##  Indirect Effect in Mediation Model
##  H0: a * b = 0 
##  HA: a * b != 0 
##  --------------------------- 
##         test power   n   ncp
##        Sobel   0.8 242 2.802
##       Aroian   0.8 250 2.802
##      Goodman   0.8 235 2.802
##        Joint    NA  NA    NA
##  Monte Carlo    NA  NA    NA
## ---------------------------- 
##  Type I error rate = 0.05

To match results with other software packages, explicitly specify these parameters as 0 (zero) (r2m = 0 and r2y = 0). There will be some warnings indicating that the function expects R-squared values greater than the base R-squared value. Note that in this case cp argument will be ignored.

# base R-squared values are 0 (zero)
# do not specify 'cp' 
pwrss.z.med(a = 0.25, b = 0.25, 
            r2m = 0, r2y = 0,
            power = 0.80, alpha = 0.05)

## Warning: specified 'r2m.x' is smaller than the base 'r2m.x'

## Warning: specified 'r2y.mx' is smaller than the base 'r2y.mx'

##  Indirect Effect in Mediation Model
##  H0: a * b = 0 
##  HA: a * b != 0 
##  --------------------------- 
##         test power   n   ncp
##        Sobel   0.8 252 2.802
##       Aroian   0.8 259 2.802
##      Goodman   0.8 243 2.802
##        Joint    NA  NA    NA
##  Monte Carlo    NA  NA    NA
## ---------------------------- 
##  Type I error rate = 0.05

X is binary:

The case where the main predictor is binary can be useful in practice. A researcher might be interested in whether treatment influence the outcome through some mediators.

p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10, 
            sdx = sqrt(p*(1-p)), 
            power = 0.80, alpha = 0.05)

##  Indirect Effect in Mediation Model
##  H0: a * b = 0 
##  HA: a * b != 0 
##  --------------------------- 
##         test power   n   ncp
##        Sobel   0.8 614 2.802
##       Aroian   0.8 626 2.802
##      Goodman   0.8 602 2.802
##        Joint    NA  NA    NA
##  Monte Carlo    NA  NA    NA
## ---------------------------- 
##  Type I error rate = 0.05

Statistical power for joint and Monte Carlo interval tests

Joint and Monte Carlo tests are only available when power is requested.

# binary X
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10, 
            sdx = sqrt(p*(1-p)), 
            n = 300, alpha = 0.05)

##  Indirect Effect in Mediation Model
##  H0: a * b = 0 
##  HA: a * b != 0 
##  --------------------------- 
##         test power   n   ncp
##        Sobel 0.500 300 1.959
##       Aroian 0.484 300 1.920
##      Goodman 0.516 300 2.000
##        Joint 0.584 300    NA
##  Monte Carlo 0.565 300    NA
## ---------------------------- 
##  Type I error rate = 0.05

Explanatory Power of Covariates

Covariates can be added to the mediator model, outcome model, or both. Explanatory power of the covariates (R-squared values) for the mediator and outcome models can be specified via r2m.x and r2y.mx arguments.

# continuous X
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
            r2m = 0.50, r2y = 0.50, 
            power = 0.80, alpha = 0.05)

##  Indirect Effect in Mediation Model
##  H0: a * b = 0 
##  HA: a * b != 0 
##  --------------------------- 
##         test power   n   ncp
##        Sobel   0.8 189 2.802
##       Aroian   0.8 194 2.802
##      Goodman   0.8 183 2.802
##        Joint    NA  NA    NA
##  Monte Carlo    NA  NA    NA
## ---------------------------- 
##  Type I error rate = 0.05

In experimental design subjects are randomly assigned to treatment and control groups. The allocation usually takes place with a rate of 50% (or probability of 0.50) meaning that half of the sample is assigned to treatment and the other half to control. Thus, the mediator model is less likely to have confounder. It is more common to add covariates to the outcome model only.

# binary X
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
            sdx = sqrt(p*(1-p)), r2y = 0.50, 
            power = 0.80, alpha = 0.05)

##  Indirect Effect in Mediation Model
##  H0: a * b = 0 
##  HA: a * b != 0 
##  --------------------------- 
##         test power   n   ncp
##        Sobel   0.8 559 2.802
##       Aroian   0.8 566 2.802
##      Goodman   0.8 551 2.802
##        Joint    NA  NA    NA
##  Monte Carlo    NA  NA    NA
## ---------------------------- 
##  Type I error rate = 0.05

7 Analysis of (Co)Variance (F Test)

ANOVA: Analysis of Variance
ANCOVA: Analysis of Covariance

In a one-way ANOVA, a researcher might be interested in comparing the means of several groups (levels of a factor) with respect to a continuous variable. In a one-way ANCOVA, they may want to adjust means for covariates. Furthermore, they may want to inspect interaction between two or three factors (two-way or three-way ANOVA), or adjust interaction for covariates (two-way or three-way ANCOVA). The following examples illustrate how to determine minimum required sample size for such designs.

7.1 One-way

7.1.1 ANOVA

A researcher is expecting a difference of Cohen’s d = 0.50 between treatment and control groups (two levels) translating into \(\eta^2 = 0.059\) (eta2 = 0.059). Means are not adjusted for any covariates. What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.059, n.levels = 2,
               power = .80, alpha = 0.05)

##  One-way Analysis of Variance (ANOVA) 
##   H0: 'eta2' or 'f2' = 0 
##   HA: 'eta2' or 'f2' > 0 
##  --------------------------------------
##  Factor A: 2 levels 
##  --------------------------------------
##  effect power n.total   ncp df1     df2
##       A   0.8     128 7.971   1 125.132
##  --------------------------------------
##  Type I error rate: 0.05

7.1.2 ANCOVA

A researcher is expecting an adjusted difference of Cohen’s d = 0.45 between treatment and control groups (n.levels = 2) after controlling for the pretest (n.cov = 1) translating into partial \(\eta^2 = 0.048\) (eta2 = 0.048). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.048, n.levels = 2, n.cov = 1,
               alpha = 0.05, power = .80)

##  One-way Analysis of Covariance (ANCOVA) 
##   H0: 'eta2' or 'f2' = 0 
##   HA: 'eta2' or 'f2' > 0 
##  --------------------------------------
##  Factor A: 2 levels 
##  --------------------------------------
##  effect power n.total   ncp df1     df2
##       A   0.8     158 7.948   1 154.626
##  --------------------------------------
##  Type I error rate: 0.05

n.cov (or n.covariates) argument has trivial effect on the results. The difference between ANOVA and ANCOVA procedure depends on whether the effect (eta2) is unadjusted or covariate-adjusted.

7.2 Two-way

7.2.1 ANOVA

A researcher is expecting a partial \(\eta^2 = 0.03\) (eta2 = 0.03) for interaction of treatment/control (Factor A: two levels) with gender (Factor B: two levels). Thus, n.levels = c(2,2). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.03, n.levels = c(2,2),
               alpha = 0.05, power = 0.80)

##  Two-way Analysis of Variance (ANOVA) 
##   H0: 'eta2' or 'f2' = 0 
##   HA: 'eta2' or 'f2' > 0 
##  --------------------------------------
##  Factor A: 2 levels 
##  Factor B: 2 levels 
##  --------------------------------------
##  effect power n.total   ncp df1    df2
##       A   0.8     256 7.909   1 251.73
##       B   0.8     256 7.909   1 251.73
##   A x B   0.8     256 7.909   1 251.73
##  --------------------------------------
##  Type I error rate: 0.05

7.2.2 ANCOVA

A researcher is expecting a partial \(\eta^2 = 0.02\) (eta2 = 0.02) for interaction of treatment/control (Factor A) with gender (Factor B) adjusted for the pretest (n.cov = 1). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.02, n.levels = c(2,2), n.cov = 1,
               alpha = 0.05, power = .80)

##  Two-way Analysis of Covariance (ANCOVA) 
##   H0: 'eta2' or 'f2' = 0 
##   HA: 'eta2' or 'f2' > 0 
##  --------------------------------------
##  Factor A: 2 levels 
##  Factor B: 2 levels 
##  --------------------------------------
##  effect power n.total   ncp df1     df2
##       A   0.8     387 7.889   1 381.539
##       B   0.8     387 7.889   1 381.539
##   A x B   0.8     387 7.889   1 381.539
##  --------------------------------------
##  Type I error rate: 0.05

7.3 Three-way

7.3.1 ANOVA

A researcher is expecting a partial \(\eta^2 = 0.02\) (eta2 = 0.02) for interaction of treatment/control (Factor A: two levels), gender (Factor B: two levels), and socio-economic status (Factor C: three levels). Thus, n.levels = c(2,2,3). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.02, n.levels = c(2,2,3),
               alpha = 0.05, power = 0.80)

##  Three-way Analysis of Variance (ANOVA) 
##   H0: 'eta2' or 'f2' = 0 
##   HA: 'eta2' or 'f2' > 0 
##  --------------------------------------
##  Factor A: 2 levels 
##  Factor B: 2 levels 
##  Factor C: 3 levels 
##  --------------------------------------
##     effect power n.total   ncp df1     df2
##          A   0.8     387 7.889   1 374.576
##          B   0.8     387 7.889   1 374.576
##          C   0.8     476 9.697   2 463.167
##      A x B   0.8     387 7.889   1 374.576
##      A x C   0.8     476 9.697   2 463.167
##      B x C   0.8     476 9.697   2 463.167
##  A x B x C   0.8     476 9.697   2 463.167
##  --------------------------------------
##  Type I error rate: 0.05

7.3.2 ANCOVA

A researcher is expecting a partial \(\eta^2 = 0.01\) (eta2 = 0.01) for interaction of treatment/control (Factor A), gender (Factor B), and socio-economic status (Factor C: three levels) adjusted for the pretest (n.cov = 1). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.01, n.levels = c(2,2,3), n.cov = 1,
               alpha = 0.05, power = .80)

##  Three-way Analysis of Covariance (ANCOVA) 
##   H0: 'eta2' or 'f2' = 0 
##   HA: 'eta2' or 'f2' > 0 
##  --------------------------------------
##  Factor A: 2 levels 
##  Factor B: 2 levels 
##  Factor C: 3 levels 
##  --------------------------------------
##     effect power n.total   ncp df1     df2
##          A   0.8     779 7.869   1 765.990
##          B   0.8     779 7.869   1 765.990
##          C   0.8     957 9.665   2 943.868
##      A x B   0.8     779 7.869   1 765.990
##      A x C   0.8     957 9.665   2 943.868
##      B x C   0.8     957 9.665   2 943.868
##  A x B x C   0.8     957 9.665   2 943.868
##  --------------------------------------
##  Type I error rate: 0.05

8 Repeated Measures (F Test)

The group (between) effect: In a one-way repeated measures ANOVA, a researcher might be interested in comparing the means of several groups (levels of a factor) with respect to an outcome variable (continuous) after controlling for the effect of time. The time effect can be thought as the improvement/grow/loss/deterioration in the outcome variable that happens naturally or due to unknown causes.

The time (within) effect: A researcher might be interested in comparing the means of several time points with respect to the outcome variable after controlling for the group effect. In other words, the change across time points is not due to the group membership but due to the improvement/grow/loss/deterioration in the outcome variable that happens naturally or due to unknown causes.

The group x time interaction: A researcher might suspect that means across groups and means across time points are not independent from each other. The amount of improvement/grow/loss/deterioration in the outcome variable depends the group membership.

8.1 The Group Effect (Between)

Example 1: Posttest only design with treatment and control groups.

A researcher is expecting a difference of Cohen’s d = 0.50 on the posttest score between treatment and control groups (n.levels = 2), translating into \(\eta^2 = 0.059\) (eta2 = 0.059). The test is administered at a single time point; thus, the ‘number of repeated measures’ is n.rm = 1. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.059,  n.levels = 2, n.rm = 1,
                power = 0.80, alpha = 0.05,
                type = "between")

##  One-way Repeated Measures 
##  Analysis of Variance (F test) 
##  H0: eta2 = 0 (or f2 = 0) 
##  HA: eta2 > 0 (or f2 > 0) 
##  ------------------------------ 
##  Number of levels (groups) = 2 
##  Number of repeated measurements = 1 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 128 
##  ------------------------------ 
##  Type of the effect = "between" 
##  Numerator degrees of freedom = 1 
##  Denominator degrees of freedom = 125.132 
##  Non-centrality parameter = 7.971 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

8.2 The Time Effect (Within)

Example 2: Pretest-posttest design with treatment group only.

A researcher is expecting a difference of Cohen’s d = 0.30 between posttest and pretest scores, translating into \(\eta^2 = 0.022\) (eta2 = 0.022). The test is administered before and after the treatment; thus, the ‘number of repeated measures’ is n.rm = 2. There is treatment group but no control group (n.levels = 1). The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.022,  n.levels = 1, n.rm = 2,
                power = 0.80, alpha = 0.05,
                corr.rm = 0.50, type = "within")

##  One-way Repeated Measures 
##  Analysis of Variance (F test) 
##  H0: eta2 = 0 (or f2 = 0) 
##  HA: eta2 > 0 (or f2 > 0) 
##  ------------------------------ 
##  Number of levels (groups) = 1 
##  Number of repeated measurements = 2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 90 
##  ------------------------------ 
##  Type of the effect = "within" 
##  Numerator degrees of freedom = 1 
##  Denominator degrees of freedom = 88.169 
##  Non-centrality parameter = 8.023 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

8.3 The Group x Time Interaction

Example 3: Pretest-posttest control-group design.

A researcher is expecting a difference of Cohen’s d = 0.40 on the posttest scores between treatment and control groups (n.levels = 2) after controlling for pretest, translating into partial \(\eta^2 = 0.038\) (eta2 = 0.038). The test is administered before and after the treatment; thus, the ‘number of repeated measures’ is n.rm = 2. The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.038,  n.levels = 2, n.rm = 2,
                power = 0.80, alpha = 0.05, 
                corr.rm = 0.50, type = "between")

##  One-way Repeated Measures 
##  Analysis of Variance (F test) 
##  H0: eta2 = 0 (or f2 = 0) 
##  HA: eta2 > 0 (or f2 > 0) 
##  ------------------------------ 
##  Number of levels (groups) = 2 
##  Number of repeated measurements = 2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 151 
##  ------------------------------ 
##  Type of the effect = "between" 
##  Numerator degrees of freedom = 1 
##  Denominator degrees of freedom = 148.97 
##  Non-centrality parameter = 7.951 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

A researcher is expecting a difference of Cohen’s d = 0.30 between posttest and pretest scores (n.rm = 2) after controlling for group membership, translating into partial \(\eta^2 = 0.022\) (eta2 = 0.022). There is both treatment and control groups (n.levels = 2). The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.022,  n.levels = 2, n.rm = 2,
                power = 0.80, alpha = 0.05, 
                corr.rm = 0.50, type = "within")

##  One-way Repeated Measures 
##  Analysis of Variance (F test) 
##  H0: eta2 = 0 (or f2 = 0) 
##  HA: eta2 > 0 (or f2 > 0) 
##  ------------------------------ 
##  Number of levels (groups) = 2 
##  Number of repeated measurements = 2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 90 
##  ------------------------------ 
##  Type of the effect = "within" 
##  Numerator degrees of freedom = 1 
##  Denominator degrees of freedom = 87.191 
##  Non-centrality parameter = 8.025 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

The rationale for inspecting the interaction is that the benefit of the treatment may depend on the pretest score (e.g. those with higher scores on the pretest improve or deteriorate more). A researcher is expecting an interaction effect of partial \(\eta^2 = 0.01\) (eta2 = 0.01). The test is administered before and after the treatment; thus, the ‘number of repeated measures’ is n.rm = 2. There is both treatment and control groups (n.levels = 2). The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.01,  n.levels = 2, n.rm = 2,
                power = 0.80, alpha = 0.05, 
                corr.rm = 0.50, type = "interaction")

##  One-way Repeated Measures 
##  Analysis of Variance (F test) 
##  H0: eta2 = 0 (or f2 = 0) 
##  HA: eta2 > 0 (or f2 > 0) 
##  ------------------------------ 
##  Number of levels (groups) = 2 
##  Number of repeated measurements = 2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 197 
##  ------------------------------ 
##  Type of the effect = "interaction" 
##  Numerator degrees of freedom = 1 
##  Denominator degrees of freedom = 194.199 
##  Non-centrality parameter = 7.927 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

9 GOF or Independence (\(\chi^2\) Test)

9.1 Vector of k Cells

Effect size: Cohen’s W for goodness-of-fit test (1 x k or k x 1 table).

How many subjects are needed to claim that girls choose STEM related majors less than males? Check the original article at https://www.aauw.org/resources/research/the-stem-gap/

Option 1: Use cell probabilities
Alternative hypothesis state that 28% of the workforce in STEM field is women whereas 72% is men (from the article).
The null hypothesis assume 50% is women and 50% is men.

prob.mat <- c(0.28, 0.72) 
pwrss.chisq.gofit(p1 = c(0.28, 0.72), 
                  p0 = c(0.50, 0.50),
                  alpha = 0.05, power = 0.80)

##  Pearson's Chi-square Goodness-of-fit Test 
##  for Contingency Tables 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 41 
##  ------------------------------ 
##  Degrees of freedom = 1 
##  Non-centrality parameter = 7.849 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Option 2: Use Cohen’s W = 0.44.
Degrees of freedom is k - 1 for Cohen’s W.

pwrss.chisq.gofit(w = 0.44, df = 1,
                  alpha = 0.05, power = 0.80)

##  Pearson's Chi-square Goodness-of-fit Test 
##  for Contingency Tables 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 41 
##  ------------------------------ 
##  Degrees of freedom = 1 
##  Non-centrality parameter = 7.849 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

9.2 Contingency Table: 2 x 2

Effect size: Phi Coefficient (or Cramer’s V or Cohen’s W) for independence test (2 x 2 contingency table).

How many subjects are needed to claim that girls are underdiagnosed with ADHD?
Check the original article at https://time.com/growing-up-with-adhd/

Option 1: Use cell probabilities.
5.6% of girls and 13.2% of boys are diagnosed with ADHD (from the article).

prob.mat <- rbind(c(0.056, 0.132),
                  c(0.944, 0.868))
colnames(prob.mat) <- c("Girl", "Boy")
rownames(prob.mat) <- c("ADHD", "No ADHD")
prob.mat

##          Girl   Boy
## ADHD    0.056 0.132
## No ADHD 0.944 0.868

pwrss.chisq.gofit(p1 = prob.mat,
                  alpha = 0.05, power = 0.80)

##  Pearson's Chi-square Goodness-of-fit Test 
##  for Contingency Tables 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 463 
##  ------------------------------ 
##  Degrees of freedom = 1 
##  Non-centrality parameter = 7.849 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Option 2: Use Phi coefficient = 0.1302134.
Degrees of freedom is 1 for Phi coefficient.

pwrss.chisq.gofit(w = 0.1302134, df = 1,
                  alpha = 0.05, power = 0.80)

##  Pearson's Chi-square Goodness-of-fit Test 
##  for Contingency Tables 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 463 
##  ------------------------------ 
##  Degrees of freedom = 1 
##  Non-centrality parameter = 7.849 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

9.3 Contingency Table: j x k

Effect size: Cramer’s V (or Cohen’s W) for independence test (j x k contingency tables).

How many subjects are needed to detect the relationship between depression severity and gender?
Check the original article at https://doi.org/10.1016/j.jad.2019.11.121

Option 1: Use cell probabilities (from the article).

prob.mat <- cbind(c(0.6759, 0.1559, 0.1281, 0.0323, 0.0078),
                  c(0.6771, 0.1519, 0.1368, 0.0241, 0.0101))
rownames(prob.mat) <- c("Normal", "Mild", "Moderate", "Severe", "Extremely Severe")
colnames(prob.mat) <- c("Female", "Male")
prob.mat

##                  Female   Male
## Normal           0.6759 0.6771
## Mild             0.1559 0.1519
## Moderate         0.1281 0.1368
## Severe           0.0323 0.0241
## Extremely Severe 0.0078 0.0101

pwrss.chisq.gofit(p1 = prob.mat,
                  alpha = 0.05, power = 0.80)

##  Pearson's Chi-square Goodness-of-fit Test 
##  for Contingency Tables 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 13069 
##  ------------------------------ 
##  Degrees of freedom = 4 
##  Non-centrality parameter = 11.935 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Option 2: Use Cramer’s V = 0.03022008 based on 5 x 2 contingency table
Degrees of freedom is (nrow - 1) * (ncol - 1) for Cramer’s V.

pwrss.chisq.gofit(w = 0.03022008, df = 4,
                  alpha = 0.05, power = 0.80)

##  Pearson's Chi-square Goodness-of-fit Test 
##  for Contingency Tables 
##  ------------------------------ 
##   Statistical power = 0.8 
##   Total n = 13069 
##  ------------------------------ 
##  Degrees of freedom = 4 
##  Non-centrality parameter = 11.935 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

10 Correlation(s) (z Test)

10.1 One Correlation

One-sided Test: Assume that the expected correlation is 0.20 and it is greater than 0.10. What is the minimum required sample size?

pwrss.z.corr(r = 0.20, r0 = 0.10,
             power = 0.80, alpha = 0.05, 
             alternative = "greater")

##  A Correlation against a Constant (z Test) 
##  H0: r = r0 
##  HA: r > r0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 593 
##  ------------------------------ 
##  Alternative = "greater" 
##  Non-centrality parameter = 2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Two-sided Test: Assume that the expected correlation is 0.20 and it is different from 0 (zero). The correlation could be 0.20 as well as -0.20. What is the minimum required sample size?

pwrss.z.corr(r = 0.20, r0 = 0,
             power = 0.80, alpha = 0.05, 
             alternative = "not equal")

##  A Correlation against a Constant (z Test) 
##  H0: r = r0 
##  HA: r != r0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 194 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = 2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

10.2 Difference between Two Correlations (Independent)

Assume that the expected correlations in the first and second groups are 0.30 and 0.20, respectively (r1 = 0.30 and r2 = 0.20).

One-sided Test: Expecting r1 - r2 greater than 0 (zero). The difference could be 0.10 but could not be -0.10. What is the minimum required sample size?

pwrss.z.2corrs(r1 = 0.30, r2 = 0.20,
               power = .80, alpha = 0.05, 
               alternative = "greater")

##  Difference between Two Correlations 
##  (Independent Samples z Test) 
##  H0: r1 = r2 
##  HA: r1 > r2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1088 
##   n2 = 1088 
##  ------------------------------ 
##  Alternative = "greater" 
##  Non-centrality parameter = 2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Two-sided Test: Expecting r1 - r2 different from 0 (zero). The difference could be -0.10 as well as 0.10. What is the minimum required sample size?

pwrss.z.2corrs(r1 = 0.30, r2 = 0.20,
               power = .80, alpha = 0.05, 
               alternative = "not equal")

##  Difference between Two Correlations 
##  (Independent Samples z Test) 
##  H0: r1 = r2 
##  HA: r1 != r2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1380 
##   n2 = 1380 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = 2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

11 Proportion(s) (z Test)

11.1 One Proportion

In the following examples p is the proportion under alternative hypothesis and p0 is the proportion under null hypothesis.

One-sided Test: Expecting p - p0 smaller than 0 (zero).

# normal approximation
pwrss.z.prop(p = 0.45, p0 = 0.50,
             alpha = 0.05, power = 0.80,
             alternative = "less",
             arcsin.trans = FALSE)

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: p = p0 
##  HA: p < p0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 613 
##  ------------------------------ 
##  Alternative = "less" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

# arcsine transformation
pwrss.z.prop(p = 0.45, p0 = 0.50,
             alpha = 0.05, power = 0.80,
             alternative = "less",
             arcsin.trans = TRUE)

##  Approach: Arcsine Transformation 
##  A Proportion against a Constant (z Test) 
##  H0: p = p0 
##  HA: p < p0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 617 
##  ------------------------------ 
##  Alternative = "less" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Two-sided Test: Expecting p - p0 smaller than 0 (zero) or greater than 0 (zero).

pwrss.z.prop(p = 0.45, p0 = 0.50,
             alpha = 0.05, power = 0.80,
             alternative = "not equal")

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: p = p0 
##  HA: p != p0 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 778 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Non-inferiority Test: The case when smaller proportion is better. Expecting p - p0 smaller than 0.01.

pwrss.z.prop(p = 0.45, p0 = 0.50, margin = 0.01,
             alpha = 0.05, power = 0.80,
             alternative = "non-inferior")

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: p - p0 <= margin 
##  HA: p - p0 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 426 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Non-inferiority Test: The case when bigger proportion is better. Expecting p - p0 greater than -0.01.

pwrss.z.prop(p = 0.55, p0 = 0.50, margin = -0.01,
             alpha = 0.05, power = 0.80,
             alternative = "non-inferior")

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: p - p0 <= margin 
##  HA: p - p0 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 426 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Non-centrality parameter = 2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority Test: The case when smaller proportion is better. Expecting p - p0 smaller than -0.01.

pwrss.z.prop(p = 0.45, p0 = 0.50, margin = -0.01,
             alpha = 0.05, power = 0.80,
             alternative = "superior")

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: p - p0 <= margin 
##  HA: p - p0 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 957 
##  ------------------------------ 
##  Alternative = "superior" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority Test: The case when bigger proportion is better. Expecting p - p0 greater than 0.01.

pwrss.z.prop(p = 0.55, p0 = 0.50, margin = 0.01,
             alpha = 0.05, power = 0.80,
             alternative = "superior")

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: p - p0 <= margin 
##  HA: p - p0 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 957 
##  ------------------------------ 
##  Alternative = "superior" 
##  Non-centrality parameter = 2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Equivalence Test: Expecting p - p0 between -0.01 and 0.01.

pwrss.z.prop(p = 0.50, p0 = 0.50, margin = 0.01,
             alpha = 0.05, power = 0.80,
             alternative = "equivalent")

##  Approach: Normal Approximation 
##  A Proportion against a Constant (z Test) 
##  H0: |p - p0| >= margin 
##  HA: |p - p0| < margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n = 21410 
##  ------------------------------ 
##  Alternative = "equivalent" 
##  Non-centrality parameter = -2.926 2.926 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

11.2 Difference between Two Proportions (Independent)

In the following examples p1 and p2 are proportions for the first and second groups under alternative hypothesis. The null hypothesis state p1 = p2 or p1 - p2 = 0.

One-sided Test: Expecting p1 - p2 smaller than 0 (zero).

# normal approximation
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
               alpha = 0.05, power = 0.80,
               alternative = "less",
               arcsin.trans = FALSE)

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 = p2 
##  HA: p1 < p2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1231 
##   n2 = 1231 
##  ------------------------------ 
##  Alternative = "less" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

# arcsine transformation
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
               alpha = 0.05, power = 0.80,
               alternative = "less",
               arcsin.trans = TRUE)

##  Approach: Arcsine Transformation 
##  Cohen's h = -0.1 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 = p2 
##  HA: p1 < p2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1233 
##   n2 = 1233 
##  ------------------------------ 
##  Alternative = "less" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Two-sided Test: Expecting p1 - p2 smaller than 0 (zero) or greater than 0 (zero).

pwrss.z.2props(p1 = 0.45, p2 = 0.50,
               alpha = 0.05, power = 0.80,
               alternative = "not equal")

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 = p2 
##  HA: p1 != p2 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1562 
##   n2 = 1562 
##  ------------------------------ 
##  Alternative = "not equal" 
##  Non-centrality parameter = -2.802 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Non-inferiority Test: The case when smaller proportion is better. Expecting p1 - p2 smaller than 0.01.

pwrss.z.2props(p1 = 0.45, p2 = 0.50, margin = 0.01,
               alpha = 0.05, power = 0.80,
               alternative = "non-inferior")

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 - p2 <= margin 
##  HA: p1 - p2 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 855 
##   n2 = 855 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Non-inferiority Test: The case when bigger proportion is better. Expecting p1 - p2 greater than -0.01.

pwrss.z.2props(p1 = 0.55, p2 = 0.50,  margin = -0.01,
               alpha = 0.05, power = 0.80,
               alternative = "non-inferior")

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 - p2 <= margin 
##  HA: p1 - p2 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 855 
##   n2 = 855 
##  ------------------------------ 
##  Alternative = "non-inferior" 
##  Non-centrality parameter = 2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority Test: The case when smaller proportion is better. Expecting p1 - p2 smaller than -0.01.

pwrss.z.2props(p1 = 0.45, p2 = 0.50, margin = -0.01,
               alpha = 0.05, power = 0.80,
               alternative = "superior")

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 - p2 <= margin 
##  HA: p1 - p2 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1923 
##   n2 = 1923 
##  ------------------------------ 
##  Alternative = "superior" 
##  Non-centrality parameter = -2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Superiority Test: The case when bigger proportion is better. Expecting p1 - p2 greater than 0.01.

pwrss.z.2props(p1 = 0.55, p2 = 0.50, margin = 0.01,
               alpha = 0.05, power = 0.80,
               alternative = "superior")

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: p1 - p2 <= margin 
##  HA: p1 - p2 > margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 1923 
##   n2 = 1923 
##  ------------------------------ 
##  Alternative = "superior" 
##  Non-centrality parameter = 2.486 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Equivalence Test: Expecting p1 - p2 between -0.01 and 0.01.

pwrss.z.2props(p1 = 0.50, p2 = 0.50, margin = 0.01,
               alpha = 0.05, power = 0.80,
               alternative = "equivalent")

##  Approach: Normal Approximation 
##  Difference between Two Proportions 
##  (Independent Samples z Test) 
##  H0: |p1 - p2| >= margin 
##  HA: |p1 - p2| < margin 
##  ------------------------------ 
##   Statistical power = 0.8 
##   n1 = 42820 
##   n2 = 42820 
##  ------------------------------ 
##  Alternative = "equivalent" 
##  Non-centrality parameter = -2.926 2.926 
##  Type I error rate = 0.05 
##  Type II error rate = 0.2

Please send any bug reports, feedback or questions to bulusmetin [at] gmail.com.

–o–

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License .

A Practical Guide to Statistical Power and Sample Size Calculations in R

Metin Bulus

2023-04-11