Sample size justification

This section contains the following:

Introduction
Things to consider when writing a sample size calculation
Scenarios where special considerations apply
Illustrative example - Binary outcomes example
Illustrative example - Continuous outcomes example
Illustrative example - Cluster trial example
Additional resources
Further reading

Introduction

The sample size for a trial should be large enough to detect a clinically important difference in the primary outcome(s) with a desired probability (the power). All protocols for pragmatic trials should have a sample size section. The calculation depends on the design of the trial, the type of data collected for the primary outcome, the desired significance level and the desired power. For example, binary outcomes (such as alive or dead) and continuous outcomes (such as length of hospital stay or weight of patient) require different methods for calculating sample sizes. Similarly, cluster trials require different methods from trials which randomly assign individuals. The sample size should be inflated to allow for loss of patients during the trial (attrition).

The protocol should indicate how the sample size was determined. If a formal power-based sample size calculation was used, the authors should base the calculation on the primary outcome(s) (see Outcome measures). All the quantities used in the calculation should be justified and reported, and the resulting target sample size per comparison group should be given.

Things to consider when writing a sample size calculation:

The sample size should be calculated on the primary outcome(s)
The calculation should reflect the type of trial design being used
The calculation should reflect the type of outcome data (binary, continuous, etc.), because the type of data determines the method used
The anticipated outcomes in intervention and control groups should be estimated
The significance level (alpha), often set at 5%, should be stated
The power (1 - beta, where beta is the probability of missing a true difference), often set at 80%, should be stated

Scenarios where special considerations apply

Cluster trials

The main consequence of adopting a cluster randomized design is that it has lower statistical power than a patient-randomized trial of equivalent size, because observations within each cluster are correlated. The extent of the clustering is measured by the ‘intracluster correlation coefficient’ (ICC). Sample sizes need to be inflated to adjust for this clustering effect, and both the ICC and the cluster size influence the inflation required. In general, increasing the cluster size above 50 will give little additional statistical power, whereas increasing the number of clusters is more efficient. Researchers often have to trade off the logistical difficulties and costs of recruiting extra clusters against those of increasing the number of patients per cluster. Within a protocol, the additional items to state are:

the assumed ICC
the number of clusters
the average cluster size
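
The inflation described above is commonly expressed as the design effect, 1 + (m - 1) x ICC, where m is the average cluster size. The Python sketch below is an illustration under that assumption (equal-sized clusters), not the method of any particular calculator. The function names and the ICC of 0.05 are hypothetical choices made for the example.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_size_per_cluster(cluster_size: float, icc: float) -> float:
    """Number of independent observations that one cluster of size m is worth."""
    return cluster_size / design_effect(cluster_size, icc)


icc = 0.05  # assumed value, for illustration only
for m in (10, 25, 50, 100, 200):
    # Print the cluster size and the effective number of independent patients
    print(m, round(effective_size_per_cluster(m, icc), 1))
```

Under this formula the effective size per cluster can never exceed 1/ICC (20 in this illustration), however many patients are recruited per cluster, which is why adding clusters is usually more efficient than enlarging them.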

Illustrative example - Binary outcomes example

The total minimum sample size was determined to be 8500 women, with half allocated to receive calcium supplements and the other half to receive placebo tablets. This sample size is sufficient to obtain 80% power to detect a 30% reduction in the rate of pre-eclampsia in the calcium group (2.8%), as compared with the placebo group rate of 4%. This rate in the placebo group is based on data obtained from one of the populations considered for the study, with HRP/RHR/WHO supported population based data collection system. This rate is at the lower end of the range from other candidate centres and gives the sample size calculation a conservative approach. An even more conservative estimation is a reduction in the rate of pre-eclampsia from 3.5% in the placebo to 2.45% in the calcium group, which requires a total of 9000 women. A total of 4500 women will be sufficient to show a reduction from 4% to 2.40% or a RR of 0.60. This latter sample size will be used as the milestone for the first interim analysis. (WHO Multicentre Randomized Trial of Calcium Supplementation for the Prevention of Pre-eclampsia - go to protocol)
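
A calculation of this kind can be sketched in Python as follows. This is a minimal illustration and not the method used in the WHO protocol: it assumes a two-sided 5% significance level (not stated in the example), the simple normal-approximation formula with unpooled variances, and no continuity correction or allowance for attrition. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_binary(p_control: float, p_intervention: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per group for comparing two proportions (two-sided test).

    Normal approximation, unpooled variances, no continuity correction.
    Other formulas give somewhat different answers.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    variance_sum = (p_control * (1 - p_control)
                    + p_intervention * (1 - p_intervention))
    difference = p_control - p_intervention
    return ceil((z_alpha + z_power) ** 2 * variance_sum / difference ** 2)


# Rates from the example: 4% with placebo, 2.8% with calcium (30% reduction)
print(n_per_group_binary(0.04, 0.028))
```

Under these assumptions the sketch gives 3577 women per group. The protocol's total of 8500 is larger; the example does not state the formula, the significance level or any allowance for losses, so the two figures are not directly comparable.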

Illustrative example - Continuous outcomes example

To have an 85% chance of detecting as significant (at the two sided 5% level) a five point difference between the two groups in the mean SF-36 general health perception scores, with an assumed standard deviation of 20 and a loss to follow up of 20%, 360 women in each group were required. (Adapted from BMJ. 2000;321:593-8 - go to article (included with permission))
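
The figure in this example is consistent with the standard normal-approximation formula for comparing two means, n = 2 x ((z_alpha + z_power) x SD / difference)^2 per group, inflated for attrition. The Python sketch below shows one way such a figure could be derived. It is an assumption about the approach, not a statement of how the original authors calculated it, and the function and parameter names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_continuous(difference: float, sd: float,
                           alpha: float = 0.05, power: float = 0.85,
                           loss: float = 0.0) -> int:
    """Participants to recruit per group for comparing two means.

    difference: smallest difference in means worth detecting
    sd:         assumed standard deviation of the outcome
    loss:       expected proportion lost to follow up (0.20 means 20%)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    n_analysed = 2 * ((z_alpha + z_power) * sd / difference) ** 2
    # Inflate so that enough participants remain after attrition
    return ceil(n_analysed / (1 - loss))


# Values from the example: 5 point difference, SD 20, 85% power, 20% loss
print(n_per_group_continuous(difference=5, sd=20, power=0.85, loss=0.20))
```

With the values quoted in the example this returns 360 per group, of whom 288 would need to complete follow up.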

Illustrative example - Cluster trial example

In order to demonstrate a 25% relative reduction, with a power of 80% and a statistical significance level of 5%, in outcome measures between the control and intervention groups, we estimated a need for a sample of around 140 practices in total (Cluster Randomisation Sample Size Calculator ver 1.0.2, Health Services Research Unit, Aberdeen University). We assumed that none of the main outcomes would be less than 50% in the control group and that the average number of patients included per practice would be at least 10 per outcome measure. This was based on sales figures of anti-hypertensive drugs, a survey on the usage of risk assessment tools, and published figures on achievement of treatment goals. The sample size also takes into account the need to adjust for intracluster correlation, which is a consequence of randomising at one level (clinical practices) and analysing at another (patients). The adjusting factor was conservatively estimated to be 0.2, based on data from a previous study. (RaPP trial - go to protocol)
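
The order of magnitude of this estimate can be checked with a short Python sketch. It is not the algorithm of the Aberdeen calculator cited in the example. It assumes that the "adjusting factor" of 0.2 is the ICC, that the control group rate is exactly 50%, that the test is two-sided, and that clusters are of equal size; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def clusters_per_arm(p_control: float, relative_reduction: float,
                     cluster_size: float, icc: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Clusters needed in each arm for a binary outcome.

    Steps: individually randomised sample size (normal approximation),
    multiplied by the design effect, divided by the cluster size.
    """
    p_intervention = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = (p_control * (1 - p_control)
                    + p_intervention * (1 - p_intervention))
    n_individual = ((z_alpha + z_power) ** 2 * variance_sum
                    / (p_control - p_intervention) ** 2)
    design_effect = 1 + (cluster_size - 1) * icc  # inflation for clustering
    return ceil(n_individual * design_effect / cluster_size)


# Assumed inputs: 50% control rate, 25% relative reduction,
# 10 patients per practice, ICC of 0.2
per_arm = clusters_per_arm(0.50, 0.25, cluster_size=10, icc=0.2)
print(per_arm, 2 * per_arm)
```

Under these assumptions the sketch gives 69 practices per arm, or 138 in total, which is in line with the "around 140 practices" quoted in the example.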

Additional resources

Sample size checklist

This checklist has been contributed by Dave Sackett, who prepared it for the forthcoming 3rd edition of Clinical Epidemiology; A Basic Science for Answering Questions about Health Care, to be published by Lippincott, Williams & Wilkins in 2005.

Sample size calculator (Windows program) and some brief instructions

This program calculates the number of participants required for a patient or cluster randomised trial using either a binary or continuous outcome. Alternatively, you can use four Excel spreadsheets:

Sample size text from 3rd edition of Clinical Epidemiology (General)

This text has been contributed by Dave Sackett, who prepared it for the forthcoming 3rd edition of Clinical Epidemiology; A Basic Science for Answering Questions about Health Care, to be published by Lippincott, Williams & Wilkins in 2005.

Trial size text from 3rd edition of Clinical Epidemiology (Large trials and small trials)

Paper nomogram for calculating the number of participants required based on a binary outcome

Paper nomogram for calculating the number of participants required based on a continuous outcome

Database of intracluster correlation coefficients and a website

Sample size calculator for cluster randomised trials

EPI Info

Further reading

This page was last updated 13th October 2006.
