Sample Size Determination: A Practical Guide For Health Researchers
DOI: 10.1002/jgf2.600
REVIEW ARTICLE
1 College of Medicine, King Saud bin Abdulaziz University for Health Sciences, Jeddah, Saudi Arabia
2 King Abdullah International Medical Research Centre, Jeddah, Saudi Arabia

Correspondence
Alaa Althubaiti, PhD, College of Medicine, King Saud bin Abdulaziz University for Health Sciences, King Abdullah International Medical Research Centre, Mail Code 6656, P.O. Box 9515, Jeddah 21423, Saudi Arabia.
Email: thubaitia@ksau-hs.edu.sa

Abstract
Although sample size calculations play an essential role in health research, published research often fails to report sample size selection. This study aims to explain the importance of sample size calculation and to provide considerations for determining sample size in a simplified manner. Approaches to sample size calculation according to study design are presented with examples in health research. For sample size estimation, researchers need to (1) provide information regarding the statistical analysis to be applied, (2) determine acceptable precision levels, (3) decide on study power, (4) specify the confidence level, and (5) determine the magnitude of practical significance differences (effect size). Most importantly, research team members need to engage in an open and realistic dialog on the appropriateness of the calculated sample size for the research question(s), available data records, research timeline, and cost. This study aims to further inform researchers and health practitioners interested in quantitative research, so as to improve their knowledge of sample size calculation.
KEYWORDS
effect size, power, regression analysis, sample size, study design
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium,
provided the original work is properly cited.
© 2022 The Author. Journal of General and Family Medicine published by John Wiley & Sons Australia, Ltd on behalf of Japan Primary Care Association.
wileyonlinelibrary.com/journal/jgf2 J Gen Fam Med. 2023;24:72–78.
ALTHUBAITI | 73
are beyond the scope of this review. Instead, challenges relating to sample size calculations in health research are summarized.

The remainder of this paper is organized as follows. In Section 2, some important terms are presented. Sections 3–6 discuss sample size calculations according to various types of study designs. Finally, Section 7 offers some general recommendations.

2 | SAMPLE SIZE: WHAT TO UNDERSTAND?

Sample size calculation involves several statistical terms, a selection of which is provided in Table S1. In the following sections, the basic concepts are discussed, and detailed guidance is provided for sample size calculation.

2.1 | Expectations regarding sample size

A sample size can be small, especially when investigating rare diseases or when the sampling technique is complicated and costly.4,7 Most academic journals do not place limitations on sample sizes.8 However, a sample size that is too small makes it challenging to reproduce the results and may produce a high rate of false negatives, which in turn undermines the scientific impact of the research. On the other hand, choosing to enlarge the sample size may be ethically unacceptable, particularly in Phase 1 studies, where human subjects are exposed to risks. Moreover, a very large sample size may lead to p-values less than the significance level even if the effect is not of practical or clinical importance (i.e., false positives).9 Hence, sample size calculation is important for striking a balance between risk and benefit.10 Researchers' focus should not be on producing large sample sizes. Instead, the focus should be on choosing an appropriately sized sample that achieves sufficient power so that statistical testing detects true positives, comprehensively reporting the analysis techniques, and interpreting the results in terms of p-values, effect size, and confidence intervals.8,11

2.2 | Sample size calculation using software programs

Sample size calculation need not be done manually, and there are several free-of-charge software tools that can assist in the calculation. For example, OpenEpi12 (an open-source online calculator) and G*Power13 (a statistical software package) are commonly used for sample size calculations. Wang and Ji14 provide an online calculator for common studies in health research. PS Power and Sample Size Calculation15 or Sample Size Calculator16 are practical tools for power and sample size calculations in studies with dichotomous, continuous, or survival outcome measures. The support offered by these tools varies in terms of the type of interface and the mathematical formula or assumptions used for calculation.17–20

2.3 | Statistical analysis to be used is important in sample size calculation

Predominantly, the sample size should be determined based on statistical analysis.2,21,22 The type of analysis should be closely related to the study design, study objective, research question(s), or primary research outcome. Most sample size calculation software packages include the option to select the required statistical test related to the response or outcome variable(s), with each test requiring a different sample size. Therefore, if a comparison between two or more groups is required after estimating the frequency of a certain attribute in the population, the calculated sample size should be adjusted, in order to account for the types of statistical tests to be used in the comparison. This ensures that the final sample size is appropriately suited to the study's main objective(s) or hypotheses.

2.4 | When possible, determine the effect size

In studies examining the effect of an intervention/exposure or the difference(s) between two or more groups, the effect size must first be determined, in order to calculate an appropriate sample size. The effect size is defined as the minimum effect an intervention must have in order to be considered clinically or practically significant.23 This is considered the most challenging step in sample size calculation. When the effect is small, identifying it and reaching an acceptable level of power requires a large sample. When the effect is large, it is easily identifiable; hence, a smaller sample size is sufficient.

The effect size is mostly determined by experience or judgment.24 It can also be estimated from previously implemented, well-designed studies (such as meta-analyses; see, for example, Thalheimer and Cook25 for a simplified illustration of how to determine effect size from published research). An initial pilot study may determine the effect size for start-up studies if accompanied by conversations with experts in the field that provide useful information on an adequate value for the effect size. In a pilot investigation, sample size calculation may not be required for the pilot sample.26 An important approach worth considering here involves enrolling pilot study participants based on the inclusion and exclusion criteria of the planned larger study and then testing the feasibility of the methods.27,28

Various solutions have been proposed for cases where effect sizes cannot be determined. Cohen29 recommends using small, medium, and large effect sizes instead of specific values (i.e., standardized or unit-free effect size). For example, when the mean difference between two groups is of interest, and an independent samples t-test is to be used, the standardized effect size is calculated as:

\[
\text{Standardized effect size} = \frac{\text{difference between two means}}{\text{standard deviation of response}}
\]

The difference between the two means is the difference of practical importance, and the standard deviation of the response is often estimated from similar previous studies.
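As a rough illustration of how a standardized effect size feeds into a sample size for comparing two means, the following minimal sketch may help. It is not the procedure used by any of the tools named in this article: it assumes a two-sided test, equal group sizes, and a normal approximation to the independent samples t-test, plus a commonly used small-sample correction term (z squared over 4). The function names and the example inputs (a difference of 5 units and a standard deviation of 10) are hypothetical, and dedicated software should be preferred for real studies.

```python
from math import ceil
from statistics import NormalDist


def standardized_effect_size(mean_difference, sd_response):
    """Difference between two means divided by the SD of the response."""
    return mean_difference / sd_response


def n_per_group(d, alpha=0.05, power=0.80, small_sample_correction=True):
    """Approximate participants per group for a two-sided, two-sample
    comparison of means with equal allocation (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    if small_sample_correction:
        # Assumed adjustment that moves the normal approximation
        # closer to t-test based results.
        n += z_alpha ** 2 / 4
    return ceil(n)


# Hypothetical inputs: a 5-unit difference of practical importance, SD of 10.
d = standardized_effect_size(mean_difference=5, sd_response=10)
per_group = n_per_group(d)
print(f"d = {d}, per group = {per_group}, total = {2 * per_group}")
```

Because the required sample size scales with 1/d², halving the effect size roughly quadruples the number of participants, which is why small effects demand large samples.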
Figure 1 illustrates that a sample size can be based on a range of standardized effect sizes and powers (e.g., d = 0.2 [small], 0.5 [medium], or 0.8 [large]). If the aim is to compare the mean difference between two groups, and an effect size of 0.5 (medium) is used, the total sample size required to reach a power of 80% is 128 participants. Hence, 64 participants are included in each group.

By contrast, for a small effect, the total sample size required is 788 patients, with 394 patients in each group. The use of such arbitrary values is common in sample size calculations.30 Although these values do not admit biological explanations, they are considered to have meaningful effects in most comparisons. However, researchers must note that these are arbitrary values, and must use their judgment to assess whether these values are acceptable in their field of research.24,31

3 | SAMPLE SIZES FOR DESCRIPTIVE STUDIES

A descriptive study is “concerned with and designed only to describe the existing distribution of variables, without regard to causal or other hypothesis.”32 Such studies include case reports, case series, and cross-sectional (prevalence) studies.33 In the latter, the objective is to describe a health phenomenon in a population at a particular point in time. The main parameter of interest is the proportion/prevalence, where the variable of interest is a categorical variable. In descriptive studies, the researcher could also be interested in the mean, where the variable of interest is a continuous response variable. For example, studies estimating the average age of children with asthma visiting the emergency room in a given year, or the prevalence of hyponatremia among the elderly in a tertiary care center, are descriptive in nature. The steps for sample size calculation in such descriptive studies are provided in Figure 2.

For example, a 95% confidence level indicates that the sample mean will not differ by more than a certain value from the true population mean in 95% of the samples repeatedly drawn from the same population. The margin of error (MoE) is a measure of the precision of an estimate. The smaller the allowed MoE, the greater the precision of our estimates and the larger the required sample size. Note that the confidence interval = estimate of value of interest ± MoE. For example, if the prevalence of burnout is 15% in a sample of residents, then, for the larger population, it is estimated to be between 5% and 25% (allowing an MoE of 10% on both sides). The standard deviation (SD) and the estimate of proportion can be obtained from previous studies. If no information regarding the SD is available, researchers can collect a pilot sample to estimate the value of SD, and use range/√n, where n is the number of observations in the pilot study.34 If the proportion is unknown, it is best to use a proportion close to what is expected; otherwise, a value of 0.5 is assumed to give a sufficiently large sample size.35 However, this value is appropriate only if the actual population proportion is between 10% and 90%; otherwise (for example, in the case of a rare or common disease), caution should be taken when substituting the proportion, as a significantly larger sample size is required.36

Note that population size is not needed as an input in most sample size calculations. The population can be defined by various elements, such as geographical, time frame, or social aspects. For example, if the prevalence of infection in a hospital's intensive care unit (ICU) during the period between 2005 and 2012 is to be estimated, then the population is all the patients admitted to the ICU during that period. In most studies, we aim to generalize the results to a larger population, although we are restricted to observing a specific population. Therefore, when estimating the sample size, population size is rarely important in medical research.37 However, if the population is limited (e.g., in a study that evaluates an academic program, where the population is all students enrolled in the program), then the sample size equations can be adjusted for the population size.37–39 The size of a finite population can be obtained from a database or records, or based on experience in the field, and is included in the sample size calculation.

4 | SAMPLE SIZE FOR STUDIES COMPARING TWO GROUPS

There are two main types of study in health research: observational and experimental. An important distinction between the two is that, in an observational study, the researcher does not impose any intervention and observes only to assess a current condition. In experimental studies, an intervention is performed, and its results are observed. When the aim is to compare two groups (intervention/control), the number of study participants should be equally divided between both groups, so as to attain the maximum power for the given sample size. Note, however, that this point is limited to
interventional studies and does not apply to observational studies (prospective vs. retrospective). The minimum sample size per group must be calculated based on the statistical test used. However, in some fields of study, such as pharmacology or biological research, a minimum of five per group is recommended and considered acceptable by academic journals in the field.4 Recommendations for minimum sample sizes for clinical studies suggest having at least 100 in each group.40 However, recent advances in sample size calculation have challenged these recommendations and have investigated the potential of simulation-based methods.41,42

Dividing participants equally between both groups might not be possible, for several reasons, e.g., costs or limited data on the treatment group in retrospective studies. In such cases, uneven groups are the best option at hand, where the researcher will opt to increase the sample in one group (e.g., control) with available data.43 Attention should be paid to the statistical data analysis to be used44 and the method for reporting results. p-values are generally large (above 0.05) in such cases,45 so reporting effect sizes29 and the mean or median with confidence intervals can be more effective in conveying the practical importance of the results. All in all, increasing the sample size increases the precision of estimates, so it is important to report these measures.

5 | PROBABILITY AND NONPROBABILITY SAMPLING

There are two types of sampling methods in research: probability (random) and nonprobability (nonrandom). In a probability sample, each unit has a known chance or probability of being selected. By contrast, in nonprobability sampling, units are drawn or chosen without specific probabilities. Probability sampling includes simple random sampling, systematic sampling, and stratified sampling. Nonprobability sampling includes convenience sampling and quota sampling.

Probability sampling has the advantages of higher generalizability, greater representativeness of the population, and lower response bias than nonprobability sampling.46 However, nonprobability sampling is the most commonly adopted type of sampling in clinical studies, survey statistics, and social research, due to its low or no cost or for ethical reasons.47–50 While calculating a sample size is important for the generalizability of results, estimating a sample size when using nonprobability sampling could be irrelevant, as convenience sampling is likely to generate nongeneralizable results, which preclude statistical inference to the larger population. As an alternative, researchers should include as many subjects as possible51 from the different subgroups and demographics. The quota sampling approach, or sample matching, might well be applied to minimize the selection bias often associated with nonprobability sampling.52 This is particularly useful if the hypothesis states that the main outcome of interest differs based on specific factors or exposure, such as gender or age group. The use of replication research studies to validate the results of nonprobability sampling is also encouraged as a strategy for ensuring generalizability.53 The methods section of a manuscript should include the number of subjects invited to participate or the size of the target population (if known) and the number of participants instead of an actual sample size calculation.49 For a review of the inferential data analysis methods for nonprobability sampling, see Buelens, Burger, and van den Brakel,46 who applied machine learning methods in order to enhance the representativeness of the aforementioned sampling.

6 | SAMPLE SIZE CALCULATION FOR REGRESSION ANALYSIS

Correlation or regression analysis is used in studies aiming to examine associations between a set of independent variables and a response variable. Failing to include an appropriate number of observations leads to an insufficient sample size, in which case regression might overfit the data.54 This means that, while the results may be valid for the study's dataset, they cannot be generalized to the population. In addition, estimates of regression coefficients are likely to be biased away from the true values, and the confidence intervals are wide.1,11 All these factors adversely affect statistical power. For regression analysis, several theories on sample size calculation have been provided in the literature regarding the use of logistic or linear regression for data fitting.55–57

The number of predictors is important for sample size calculation in regression analysis. A larger sample size is required for a higher number of predictors. In cases where interaction terms have more than two predictors, the number of interaction terms and the degree of interaction can become large. When the sample size is not large enough to conduct such a regression analysis, one might add only important interaction terms with a large effect or use practical judgment to form the interaction terms.

Another important element in sample size calculation is the R-squared, defined as the measure of the strength of association between the regression model and the response; it is also defined as the proportion of the variance in the response that is explained collectively by the independent variables.58 Calculating the sample size required for multiple regression analysis is equivalent to ascertaining the number of subjects to be enrolled to produce an acceptable R-squared or goodness-of-fit. Multiple regression analysis aims
to determine whether a variable is significantly associated with the outcome after controlling for all the other predictors. For purposes of estimating the effect size of each variable in multiple regression, an assumption is made regarding the value of the R-squared, because the exact estimates of the regression coefficients of these variables are unknown. It is then possible to calculate Cohen's f² effect size, which is defined as the ratio of the proportion of variance accounted for relative to the proportion of variance unaccounted for, where f² is classified as a small, medium, or large effect size (f² = 0.02, 0.15, or 0.35, respectively).29

Calculating sample size on the assumption that regression analysis is to be used is not practical in many cases. For example, in any study, there may be more than one multiple regression model, and estimating the sample size for each model is not practical. Although it is common practice to estimate a sample size sufficient to estimate the minimum effect size, a minimum effect size might not be identifiable in some cases. Hence, researchers have often relied on “rules-of-thumb” to determine approximate sample sizes. For example, one of the considered rules-of-thumb calls for 10 observations per variable.59 In addition, the sample size should be larger than the number of predictors, or else the regression coefficients cannot be estimated. How much larger the sample size needs to be is an issue of debate and depends on the field of study, e.g., biological or social research. Green60 challenges most of the commonly used rules and argues for an approach that considers the effect sizes. While he has provided some support for the latter, he also argues that it is not appropriate when dealing with seven or more model predictors, though it is suitable when there is a medium-sized association between the response and predictors. More recent proposals in sample size determination reportedly overcome the design or practical challenges in the field.7,59

7 | GENERAL RECOMMENDATIONS

The value of the response rate is often derived from experience or previous research. For example, to estimate the proportion of burnout in staff residents in a regional hospital, consider a sample with 15% burnout. Allowing for an MoE of 5% and a confidence level of 95%, the minimum sample size is 195.9 (196 after rounding up). The recommended sample size can be set at 245, so as to allow for a 20% nonresponse rate. Note that a large nonresponse rate is assumed here, as the population involves physicians.63

7.2 | Avoid unrealistically large samples

For start-up studies or studies where no previously established literature is available, we recommend opting for medium to large effect sizes and not setting sample sizes based on the minimum effect that would be of practical significance. The results of such studies can provide insights and useful information for future meta-analyses. This also applies if the research is an undergraduate project with limited resources.

For example, a researcher comparing the incidence of a certain outcome between two independent groups might initially be interested in serious complications in patients exposed to two distinct surgical treatments. However, if this number is very small, a large sample size will be required. If resources allow, the researcher should perhaps investigate whether there is a sufficiently large number of surgeries in the current hospital; if there is not, it may be advisable to cover more centers. Alternatively, these researchers could alter their research question so that it is concerned with the incidence of any complications following the procedure, and not limited to serious complications. Hence, the required sample size would be smaller and more feasible. In short, researchers should always look at the sample size and judge whether it is reasonable and suited to their research question(s).
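The burnout example given earlier in this section (15% expected prevalence, 5% MoE, 95% confidence, 20% nonresponse) can be reproduced with a short sketch. The article does not state which formula was used, so the code below assumes the common normal-approximation formula for a single proportion, n = z² p(1 − p) / MoE², and inflates the result by dividing by the expected response rate; both choices are assumptions, although they agree with the figures quoted in the text. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion(p, moe, confidence=0.95):
    """Minimum sample size to estimate a proportion p within +/- moe
    (normal approximation, no finite population adjustment)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / moe ** 2


def inflate_for_nonresponse(n, nonresponse_rate):
    """Number of subjects to invite so that about n are expected to respond."""
    return ceil(n / (1 - nonresponse_rate))


n_min = n_for_proportion(p=0.15, moe=0.05)        # about 195.9
n_invite = inflate_for_nonresponse(n_min, 0.20)   # 245
print(round(n_min, 1), n_invite)
```

Dividing by the response rate, rather than multiplying by 1.2, is what yields 245 here; multiplying would give a smaller figure that would fall short of the target if 20% of invitees do not respond.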
PATIENT CONSENT STATEMENT
None.

CLINICAL TRIAL REGISTRATION
None.

CONFLICT OF INTEREST
The authors have stated explicitly that there are no conflicts of interest in connection with this article.

DATA AVAILABILITY STATEMENT
Data sharing not applicable; no new data generated.

ORCID
Alaa Althubaiti https://orcid.org/0000-0002-4175-1703

REFERENCES
1. Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14(5):365–76. https://doi.org/10.1038/nrn3475
2. Wilson Vanvoorhis CR, Morgan BL. Understanding power and rules of thumb for determining sample sizes. Tutor Quant Methods Psychol. 2007;3(2):43–50.
3. Serdar CC, Cihan M, Yücel D, Serdar MA. Sample size, power and effect size revisited: simplified and practical approaches in pre-clinical, clinical and laboratory studies. Biochem Med. 2021;31(1):1–27. https://doi.org/10.11613/BM.2021.010502
4. Curtis MJ, Bond RA, Spina D, Ahluwalia A, Alexander SPA, Giembycz MA, et al. Experimental design and analysis and their reporting: new guidance for publication in BJP. Br J Pharmacol. 2015;172(14):3461–71. https://doi.org/10.1111/BPH.12856
5. American Psychological Association. Publication manual of the American Psychological Association. 7th ed. Washington, DC: American Psychological Association; 2020.
6. Bartlett JE, Kotrlik JW, Higgins CC. Organizational research: determining appropriate sample size in survey research. Inf Technol Learn Perform J. 2001;19(1):43–50.
7. Jenkins DG, Quintana-Ascencio PF. A solution to minimum sample size for regressions. PLoS One. 2020;15(2):e0229345. https://doi.org/10.1371/JOURNAL.PONE.0229345
8. Bacchetti P, Deeks SG, McCune JM. Breaking free of sample size dogma to perform innovative translational research. Sci Transl Med. 2011;3(87):87ps24. https://doi.org/10.1126/SCITRANSLMED.3001628
9. Lindley DV. A statistical paradox. Biometrika. 1957;44(1/2):187–92. https://doi.org/10.2307/2333251
10. Bacchetti P, Wolf LE, Segal MR, McCulloch CE. Ethics and sample size. Am J Epidemiol. 2005;161(2):105–10. https://doi.org/10.1093/AJE/KWI014
11. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):e124. https://doi.org/10.1371/JOURNAL.PMED.0020124
12. Dean A, Sullivan K, Soe M. OpenEpi: open source epidemiologic statistics for public health. Published 2013. Available from: https://www.openepi.com/Menu/OE_Menu.htm. Accessed 27 Aug 2022.
13. Faul F, Erdfelder E, Lang A-G, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007;39(2):175–91. https://doi.org/10.3758/BF03193146
14. Wang X, Ji X. Sample size estimation in clinical research: from randomized controlled trials to observational studies. Chest. 2020;158(1):S12–20. https://doi.org/10.1016/J.CHEST.2020.03.010
15. Dupont WD, Plummer WD. PS: Power and Sample Size Calculations. Available from: https://biostat.app.vumc.org/wiki/Main/PowerSampleSize. Accessed 5 Nov 2022.
16. Nagashima K. Sample Size Calculator. Available from: https://nshi.jp/en/js/. Accessed 5 Nov 2022.
17. Dupont WD, Plummer WD. Power and sample size calculations: a review and computer program. Control Clin Trials. 1990;11(2):116–28. https://doi.org/10.1016/0197-2456(90)90005-M
18. Dattalo P. A review of software for sample size determination. Eval Health Prof. 2009;32(3):229–48. https://doi.org/10.1177/0163278709338556
19. Landau S, Stahl D. Sample size and power calculations for medical studies by simulation when closed form expressions are not available. Stat Methods Med Res. 2012;22(3):324–45. https://doi.org/10.1177/0962280212439578
20. Kumar A, Dogra S, Kaur A, Modi M, Thakur A, Saluja S. Approach to sample size calculation in medical research. Curr Med Res Pract. 2014;4(2):87–92. https://doi.org/10.1016/J.CMRP.2014.04.001
21. Quinn GP, Keough MJ. Experimental design and data analysis for biologists. Cambridge: Cambridge University Press; 2002.
22. van Belle G. Statistical rules of thumb. 2nd ed. Hoboken, NJ: Wiley; 2008. https://doi.org/10.1002/9780470377963
23. Kraemer HC, Mintz J, Noda A, Tinklenberg J, Yesavage JA. Caution regarding the use of pilot studies to guide power calculations for study proposals. Arch Gen Psychiatry. 2006;63(5):484–9. https://doi.org/10.1001/ARCHPSYC.63.5.484
24. Lenth RV. Some practical guidelines for effective sample size determination. Am Stat. 2001;55(3):1–11. https://doi.org/10.1198/000313001317098149
25. Thalheimer W, Cook SR. How to calculate effect sizes from published research: a simplified methodology. Work-Learning Research. Published 2002. Available from: http://work-learning.com/effect_sizes.htm. Accessed 28 Sep 2021.
26. Jones SR, Carley S, Harrison M. An introduction to power and sample size estimation. Emerg Med J. 2003;20(5):453–8. https://doi.org/10.1136/EMJ.20.5.453
27. Hertzog MA. Considerations in determining sample size for pilot studies. Res Nurs Health. 2008;31(2):180–91. https://doi.org/10.1002/NUR.20247
28. Westlund E, Stuart EA. The nonuse, misuse, and proper use of pilot studies in experimental evaluation research. Am J Eval. 2017;38(2):246–61. https://doi.org/10.1177/1098214016651489
29. Cohen J. Statistical power analysis for the behavioural sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
30. Sullivan GM, Feinn R. Using effect size—or why the P value is not enough. J Grad Med Educ. 2012;4(3):279–82. https://doi.org/10.4300/JGME-D-12-00156.1
31. Norman G, Monteiro S, Salama S. Sample size calculations: should the emperor's clothes be off the peg or made to measure? BMJ. 2012;345:g5341. https://doi.org/10.1136/BMJ.E5278
32. Porta M. A dictionary of epidemiology. 6th ed. Oxford: Oxford University Press; 2014.
33. Grimes DA, Schulz KF. Descriptive studies: what they can and cannot do. Lancet. 2002;359:145–9. https://doi.org/10.1016/S0140-6736(02)07373-7
34. Mantel N. Rapid estimation of standard errors of means for small samples. Am Stat. 1951;5(4):26–7. https://doi.org/10.1080/00031305.1951.10501120
35. Macfarlane S. Conducting a descriptive survey: 2. Choosing a sampling strategy. Trop Doct. 1997;27(1):14–21. https://doi.org/10.1177/004947559702700108
36. Naing L, Winn T, Rusli BN. Practical issues in calculating the sample size for prevalence studies. Arch Orofac Sci. 2006;1:9–14.
37. Kasiulevicius V, Sapoka V, Filipaviciute R. Sample size calculation in epidemiological studies. Gerontology. 2006;7:225–31.
38. Yamane T. Statistics: an introductory analysis. 2nd ed. New York, NY: Harper and Row; 1967.
39. Noble RB, Bailer AJ, Kunkel SR, Straker JK. Sample size requirements for studying small populations in gerontology research. Health Serv Outcomes Res Method. 2006;6(1–2):59–67. https://doi.org/10.1007/S10742-006-0001-4/FIGURES/4
40. Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. New York, NY: Springer US; 2009. https://doi.org/10.1007/978-0-387-77244-8
41. Collins GS, Ogundimu EO, Altman DG. Sample size considerations for the external validation of a multivariable prognostic model: a resampling study. Stat Med. 2016;35(2):214–26. https://doi.org/10.1002/SIM.6787
42. Snell KIE, Archer L, Ensor J, Bonnett LJ, Debray TPA, Phillips B, et al. External validation of clinical prediction models: simulation-based sample size calculations were more reliable than rules-of-thumb. J Clin Epidemiol. 2021;135:79–89. https://doi.org/10.1016/J.JCLINEPI.2021.02.011
43. Singh AS, Masuku M, Masuku MB. Sampling techniques and determination of sample size in applied statistics research: an overview. Int J Econ Commer Manag. 2014;2(11):1–22.
44. Riniolo TC. Using a large control group for statistical comparison: evaluation of a between groups median test. J Exp Educ. 1999;68(1):75–88.
45. Kim TK, Park JH. More about the basic assumptions of t-test: normality and sample size. Korean J Anesthesiol. 2019;72(4):331–5. https://doi.org/10.4097/KJA.D.18.00292
46. Buelens B, Burger J, van den Brakel JA. Comparing inference methods for non-probability samples. Int Stat Rev. 2018;86(2):322–43. https://doi.org/10.1111/INSR.12253
47. Elfil M, Negida A. Sampling methods in clinical research; an educational review. Emergency. 2017;5(1):52. https://doi.org/10.1136/eb-2014
48. Jager J, Putnick DL, Bornstein MH. More than just convenient: the scientific merits of homogeneous convenience samples. Monogr Soc Res Child Dev. 2017;82(2):13–30. https://doi.org/10.1111/MONO.12296
49. Stratton SJ. Population research: convenience sampling strategies. Prehosp Disaster Med. 2021;36(4):373–4. https://doi.org/10.1017/S1049023X21000649
50. Yang S, Kim JK. Statistical data integration in survey sampling: a review. Jpn J Stat Data Sci. 2022;3:625–50. https://doi.org/10.1007/s42081-020-00093-w
51. Guo S, Hussey DL. Nonprobability sampling in social work research. J Soc Serv Res. 2004;30(3):1–18. https://doi.org/10.1300/J079v30n03_01
52. Baker R, Brick JM, Bates NA, Battaglia M, Couper MP, Dever JA, et al. Report of the AAPOR Task Force on Non-Probability Sampling; 2013. Available from: https://www.aapor.org/Education-Resources/Reports/Non-Probability-Sampling.aspx
53. Sarstedt M, Bengart P, Shaltoni AM, Lehmann S. The use of sampling methods in advertising research: a gap between theory and practice. Int J Advert. 2017;37(4):650–63. https://doi.org/10.1080/02650487.2017.1348329
54. Babyak MA. What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom Med. 2004;66(3):411–21. https://doi.org/10.1097/01.PSY.0000127692.23278.A9
55. Hsieh FY, Bloch DA, Larsen MD. A simple method of sample size calculation for linear and logistic regression. Stat Med. 1998;17:1623–34. https://doi.org/10.1002/(SICI)1097-0258(19980730)17:14
56. Dupont WD, Plummer WD. Power and sample size calculations for studies involving linear regression. Control Clin Trials. 1998;19(6):589–601. https://doi.org/10.1016/S0197-2456(98)00037-3
57. Maxwell SE. Sample size and multiple regression analysis. Psychol Methods. 2000;5(4):434–58. https://doi.org/10.1037/1082-989X.5.4.434
58. Daniel WW, Cross CL. Biostatistics: a foundation for analysis in the health sciences. Hoboken, NJ: Wiley; 1999.
59. Riley RD, Ensor J, Snell KIE, Harrell FE Jr, Martin GP, Reitsma JB, et al. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441. https://doi.org/10.1136/BMJ.M441
60. Green SB. How many subjects does it take to do a regression analysis. Multivariate Behav Res. 1991;26(3):499–510. https://doi.org/10.1207/S15327906MBR2603_7
61. Bath P. Calculation of sample size for stroke trials assessing functional outcome: comparison of binary and ordinal approaches: the optimising analysis of stroke trials (OAST) collaboration. Int J Stroke. 2008;3(2):78–84. https://doi.org/10.1111/J.1747-4949.2008.00184.X
62. Moore CM, MaWhinney S, Forster JE, Carlson NE, Allshouse A, Wang X, et al. Accounting for dropout reason in longitudinal studies with nonignorable dropout. Stat Methods Med Res. 2015;26(4):1854–66. https://doi.org/10.1177/0962280215590432
63. Cunningham CT, Quan H, Hemmelgarn B, Noseworthy T, Beck CA, Dixon E, et al. Exploring physician specialist response rates to web-based surveys. BMC Med Res Methodol. 2015;15(1):1–8. https://doi.org/10.1186/S12874-015-0016-Z

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Althubaiti A. Sample size determination: A practical guide for health researchers. J Gen Fam Med. 2023;24:72–78. https://doi.org/10.1002/jgf2.600