This is a brief guide to using the `pwr`^{1} package by way of a few examples.

Recall that for a statistical test the following factors are inter-related:

- The desired level of significance, \(\alpha\) = P(Type I error)
- The power of the test, (1 - \(\beta\)) = 1 - P(Type II error)
- The sample size, \(n\)
- The minimal effect size of interest
- The variance in the response variable

Thus, fixing any four of these factors determines the remaining fifth.
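As a rough illustration of this trade-off, the sketch below uses base R's `power.t.test()`, which solves for whichever argument is left unspecified. The effect size, standard deviation, and power targets are illustrative assumptions, not values from a real study.

```r
# Base R only: power.t.test() solves for whichever argument is left as NULL.
# All numeric inputs here are illustrative assumptions.

# Solve for the sample size per group, given effect size, variance, alpha, and power
solve_n <- power.t.test(delta = 0.8, sd = 1, sig.level = 0.05, power = 0.8)

# Solve for power instead, feeding back the (fractional) sample size found above
solve_power <- power.t.test(n = solve_n$n, delta = 0.8, sd = 1, sig.level = 0.05)

solve_n$n          # required n per group (round up in practice)
solve_power$power  # recovers the requested power of roughly 0.8
```

Swapping which argument is omitted changes the quantity being solved for, while the other four stay fixed.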

In base `R`, the `stats` package has some functions for calculating power, namely `power.t.test()`, `power.prop.test()`, and `power.anova.test()`.

The `pwr` package includes substitutes for these functions plus a few more.

`pwr` package

The `pwr` package has various functions useful for power calculations. The first four below overlap with the above-mentioned `stats` package functions:

- `pwr.t.test()`: 1-, 2-sample, and paired t-test
- `pwr.t2n.test()`: 2-sample t-test
- `pwr.2p.test()`: 2-sample test of proportions (equal size)
- `pwr.anova.test()`: balanced 1-way ANOVA
- `pwr.2p2n.test()`: 2-sample test of proportions (unequal size)
- `pwr.p.test()`: 1-sample test of proportions
- `pwr.r.test()`: correlation test
- `pwr.chisq.test()`: chi-squared goodness of fit or association test
- `pwr.f2.test()`: test of linear model coefficients

One difference between the base `stats` and the `pwr` functions is that the latter generally expect standardised (Cohen^{2}) effect sizes as an argument rather than sample statistics such as proportions, means, or variances.
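To make the distinction concrete, here is a minimal sketch of converting sample statistics into standardised effect sizes before calling `pwr` functions. The means, standard deviation, and proportions are hypothetical planning values chosen for illustration. `ES.h()` is the helper that `pwr` provides for the arcsine-transformed difference between two proportions.

```r
library(pwr)

# Hypothetical planning values (assumptions for illustration only)
mean_1    <- 10
mean_2    <- 12
sd_pooled <- 2.5

# Cohen's d: difference in means expressed in units of the common standard deviation
d <- (mean_2 - mean_1) / sd_pooled

# Cohen's h: difference between two proportions on the arcsine scale
h <- ES.h(p1 = 0.65, p2 = 0.50)

# The pwr functions take these standardised values rather than the raw statistics
pwr.t.test(d = d, sig.level = 0.05, power = 0.8)
pwr.2p.test(h = h, sig.level = 0.05, power = 0.8)
```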

More detailed documentation on the `pwr` package can be found in its vignette on CRAN.

Install the `pwr` package using the `install.packages()` command:
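A minimal sketch of the installation step, assuming a CRAN mirror is configured for the session. The `requireNamespace()` guard is an optional addition that skips the download when the package is already present.

```r
# Install from CRAN only if the package is not already available
if (!requireNamespace("pwr", quietly = TRUE)) {
  install.packages("pwr")
}

# Attach the package so its functions can be called directly
library(pwr)
```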

The model for multiple linear regression is as follows:

\[y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\]

The null hypothesis is that none of the \(p\) explanatory variables \(x_i\) explains any of the variability in the response variable \(y\). This would mean their regression coefficients, \(\beta_i\), are all equal to zero.

The alternative hypothesis is that *at least one* of the coefficients is not equal to zero.

\(H_0: \beta_i = 0, \quad \forall i = 1, 2, \dots, p.\)

\(H_A: \textrm{At least one}\; \beta_i \ne 0,\; \textrm{for}\;i = 1, 2, \dots, p.\)

The `pwr` function for calculating sample sizes for multiple linear regression is `pwr.f2.test()`.

```
## function (u = NULL, v = NULL, f2 = NULL, sig.level = 0.05, power = NULL)
## NULL
```

The numerator degrees of freedom, \(u\), is the number of coefficients in your model, excluding the intercept.

The denominator degrees of freedom, \(v\), is the number of error degrees of freedom, \(v = n - u - 1\). Rearranging gives an expression for the sample size, \(n = v + u + 1\) (always rounding \(v\) *up* to the next integer).

The effect size is \(f^2 = \frac{R^2}{1 - R^2}\), where \(R^2\) is the coefficient of determination, i.e., the proportion of variance in the response variable explained by the multiple regression model.

One way to determine the effect size parameter is by first hypothesising an \(R^2\) value, i.e., the proportion of variance that the model will explain.

For example, if we have:

- six explanatory variables,
- \(R^2 = 20\%\) which gives \(f^2 = \frac{0.2}{1 - 0.2} = 0.25\),
- significance level of \(\alpha = 5\%\), and
- power of \(1 - \beta = 0.8\),

then passing these to the `pwr.f2.test()` function:

```
##
## Multiple regression power calculation
##
## u = 6
## v = 54.09317
## f2 = 0.25
## sig.level = 0.05
## power = 0.8
```

we get \(v = 55\) (rounding up).

From this we can calculate the sample size: \(n = v + u + 1 = 55 + 6 + 1 = 62\).
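The steps above can be sketched in code as follows. This is one plausible way to reproduce the calculation; the variable names are illustrative.

```r
library(pwr)

u  <- 6               # number of explanatory variables
r2 <- 0.20            # hypothesised R^2
f2 <- r2 / (1 - r2)   # effect size f^2 = R^2 / (1 - R^2)

# v is left unspecified, so pwr.f2.test() solves for it
res <- pwr.f2.test(u = u, f2 = f2, sig.level = 0.05, power = 0.8)

v <- ceiling(res$v)   # round the error degrees of freedom up
n <- v + u + 1        # required sample size
n
```

With the inputs listed above, `n` should come out as 62, matching the hand calculation.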

Alternatively, Cohen (1982)^{3} suggests that \(f^2\) values of 0.02, 0.15, and 0.35 represent small, medium, and large effect sizes respectively.

These values are conveniently stored in the `pwr` package and retrieved using the `cohen.ES()` function:

```
##
## Conventional effect size from Cohen (1982)
##
## test = f2
## size = small
## effect.size = 0.02
```

```
##
## Conventional effect size from Cohen (1982)
##
## test = f2
## size = medium
## effect.size = 0.15
```

```
##
## Conventional effect size from Cohen (1982)
##
## test = f2
## size = large
## effect.size = 0.35
```
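Output like the above can be obtained with calls along these lines. The loop over the three size labels is a convenience added here for illustration; calling `cohen.ES()` once per size works equally well.

```r
library(pwr)

# Printing the returned object shows the formatted summary
cohen.ES(test = "f2", size = "small")

# Collect the three conventional f2 values into a named vector
sizes <- c("small", "medium", "large")
f2_conventions <- sapply(sizes, function(s) {
  cohen.ES(test = "f2", size = s)$effect.size
})
f2_conventions
```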

Taking the large effect size, we have:

- six explanatory variables,
- \(f^2 = 0.35\) (large effect size),
- significance level of \(\alpha = 5\%\), and
- power of \(1 - \beta = 0.8\).

```
##
## Multiple regression power calculation
##
## u = 6
## v = 38.62994
## f2 = 0.35
## sig.level = 0.05
## power = 0.8
```

Rounding \(v\) up to 39, the sample size is \(n = v + u + 1 = 39 + 6 + 1 = 46\), i.e., to achieve a power of 80% and be able to detect a large effect size, a sample size of 46 is needed.
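Since the same three steps (solve for \(v\), round up, add \(u + 1\)) recur for every effect size, they can be wrapped in a small helper. The function name `regression_sample_size()` is hypothetical and not part of `pwr`; this is a sketch rather than a definitive implementation.

```r
library(pwr)

# Hypothetical helper (not part of pwr): sample size for a multiple regression
# with u explanatory variables and effect size f2
regression_sample_size <- function(u, f2, sig.level = 0.05, power = 0.8) {
  res <- pwr.f2.test(u = u, f2 = f2, sig.level = sig.level, power = power)
  ceiling(res$v) + u + 1   # n = v + u + 1, with v rounded up
}

f2_large <- cohen.ES(test = "f2", size = "large")$effect.size
n_large  <- regression_sample_size(u = 6, f2 = f2_large)
n_large
```

For six explanatory variables and the large conventional effect size, this should return 46, in line with the calculation above.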

The next example is a one-way ANOVA where each group has the same number of samples, i.e., the design is *balanced*.

The null hypothesis is that the means of all groups are the same.

The alternative hypothesis is that the mean of at least one group differs from the others.

\(H_0: \mu_1 = \mu_2 = \dots = \mu_k\)

\(H_A: \textrm{at least one}\; \mu_i\; \textrm{is different from the others}\)

```
##
## Conventional effect size from Cohen (1982)
##
## test = anov
## size = small
## effect.size = 0.1
```

Therefore we have:

- \(k = 3\) groups,
- \(f = 0.1\) (small effect size),
- significance level of \(\alpha = 5\%\), and
- power of \(1 - \beta = 0.8\).

```
##
## Balanced one-way analysis of variance power calculation
##
## k = 3
## n = 322.157
## f = 0.1
## sig.level = 0.05
## power = 0.8
##
## NOTE: n is number in each group
```

Therefore to have 80% power and be able to detect a small difference in effects between groups, 323 samples are needed in each group. That makes a total of 969 samples! Large numbers of samples are needed if you want to detect small effects reliably.
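The ANOVA calculation can be sketched as follows, using `cohen.ES()` for the conventional small effect size and `pwr.anova.test()` to solve for the per-group sample size. The variable names are illustrative.

```r
library(pwr)

# Conventional small effect size for ANOVA (f = 0.1)
f_small <- cohen.ES(test = "anov", size = "small")$effect.size

# n is left unspecified, so pwr.anova.test() solves for the number per group
res <- pwr.anova.test(k = 3, f = f_small, sig.level = 0.05, power = 0.8)

n_per_group <- ceiling(res$n)       # round up to whole samples
n_total     <- n_per_group * res$k  # total across all groups
c(per_group = n_per_group, total = n_total)
```

This should give 323 per group and 969 in total, as stated above.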

For a two-sample t-test where each group has the same number of samples, the null hypothesis is that the means of the two groups are the same.

The alternative hypothesis is that the mean of group 2 is larger than that of group 1.

\(H_0: \mu_1 = \mu_2\)

\(H_A: \mu_1 < \mu_2\)

```
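# Looking for a large effect size (Cohen's d = 0.8)
library(pwr)

# n is left unspecified, so pwr.t.test() solves for the per-group sample size
pwr.t.test(d = cohen.ES(test = "t", size = "large")$effect.size,
           power = 0.80,
           sig.level = 0.05,
           alternative = "greater")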
```

```
##
## Two-sample t test power calculation
##
## n = 20.03277
## d = 0.8
## sig.level = 0.05
## power = 0.8
## alternative = greater
##
## NOTE: n is number in *each* group
```

Rounding up to 21 samples per group, the required total sample size is \(n = 21 \times 2 = 42\).

```
# Compare result with built-in R function:
# delta is left unspecified, so power.t.test() solves for the detectable difference
power.t.test(n = 20,
             sd = 1,
             sig.level = 0.05,
             power = 0.8,
             alternative = "one.sided")
```

```
##
## Two-sample t test power calculation
##
## n = 20
## delta = 0.8006829
## sd = 1
## sig.level = 0.05
## power = 0.8
## alternative = one.sided
##
## NOTE: n is number in *each* group
```

Working backwards from the sample size, we see that `power.t.test()` returns a similar effect size estimate to the `pwr` function. Because `sd = 1`, `delta` is on the same scale as Cohen's \(d\).
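The comparison can also be run in the forward direction, asking both functions for the sample size at the same effect size. This is a sketch under the assumption that `sd = 1`, so that `delta` and \(d\) coincide.

```r
library(pwr)

# Per-group sample size from pwr, using the standardised effect size d
by_pwr <- pwr.t.test(d = 0.8, sig.level = 0.05, power = 0.8,
                     alternative = "greater")

# Per-group sample size from base R, using delta with sd = 1
by_stats <- power.t.test(delta = 0.8, sd = 1, sig.level = 0.05, power = 0.8,
                         alternative = "one.sided")

c(pwr = by_pwr$n, stats = by_stats$n)
```

Both values should agree closely and round up to 21 per group.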