Impact of sample size

Implications of the size of the sample on significance testing

In sideways statistics, the shape of the probability curves is determined by the sample size†. (Technically, it's a function of the standard deviation, which itself is a function of the sample size and the sum of variances from the mean. But since the only one of those that you have some control over in your experimental design is sample size, we'll treat the others as unvarying.)

The larger the sample size in an analysis the less influence random error has on the sample correlation. Therefore, for larger sample sizes the population probability curve that surrounds the sample correlation will be less spread out. Figure 01 represents how the population probability curve tightens as the sample size increases. The sample size increases from one curve to the next.
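The tightening can be put into rough numbers. The sketch below is an illustration rather than the method behind Figure 01: it assumes the standard Fisher z-transformation approximation, in which the spread around a sample correlation shrinks in proportion to 1/sqrt(N - 3). The function name `correlation_interval` and the example correlation of .30 are made up for the demonstration.

```python
import math
from statistics import NormalDist

def correlation_interval(r, n, confidence=0.95):
    """Approximate interval around a sample correlation r (Fisher z approximation).

    Hypothetical helper for illustration only; requires n > 3 and -1 < r < 1.
    """
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = 1 / math.sqrt(n - 3)      # standard error on the Fisher z scale
    z = math.atanh(r)              # transform r to the z scale
    # Build the interval on the z scale, then transform back to correlations.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Same sample correlation, increasing sample sizes.
for n in (20, 80, 320):
    low, high = correlation_interval(0.30, n)
    print(f"N={n:4d}  interval=({low:+.3f}, {high:+.3f})  width={high - low:.3f}")
```

Under this approximation the interval narrows with each increase in N, and at the smallest N it still reaches across zero.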

Figure 01: Probabilities tighten as N increases


In those examples, alpha is constant, and therefore as the probability curve tightens the line that bisects the probability curve drops closer to the null line. In other words, at a constant confidence level, the larger the sample size is, the lower the minimum correlation that is statistically significant becomes.

Figure 02 shows the relation between sample size and the minimum statistically significant correlation (for an analysis with one independent variable, p<.05, and a statistical power of 50%). The relation is what we would expect from the sideways examples—as sample sizes get larger, the minimum statistically significant sample correlation drops.
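A relation of this kind can be approximated in a few lines. This is a sketch under stated assumptions, not the calculation used for Figure 02: it assumes a two-sided test at p<.05 and uses the Fisher z approximation instead of the exact t-based critical value, so small-sample values will be slightly off. The name `min_significant_r` is hypothetical.

```python
import math
from statistics import NormalDist

def min_significant_r(n, alpha=0.05):
    """Approximate smallest |r| that is significant at the given alpha (two-sided).

    Hypothetical helper; uses the Fisher z approximation and requires n > 3.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # On the z scale the threshold is z_crit standard errors away from zero.
    return math.tanh(z_crit / math.sqrt(n - 3))

for n in (10, 30, 100, 1000, 10000):
    print(f"N={n:6d}  minimum significant r = {min_significant_r(n):.3f}")
```

The printed thresholds fall steadily as N grows, which is the same downward trend the figure describes.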

Figure 02: Relation of minimum statistical significance to sample size


This relation between sample size and the minimum significant correlation leads to one of the major criticisms of null-hypothesis significance testing (NHST): the only thing NHST really proves is that a sufficiently large sample was used. Technically, any non-zero correlation can be statistically significant at any confidence level if you make the sample size large enough.
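To make the criticism concrete, the question can be turned around: how large a sample would make a given correlation significant? The following is a minimal sketch, again assuming the Fisher z approximation and a two-sided test; `min_n_for_significance` is a hypothetical name, and the results are approximate rather than exact.

```python
import math
from statistics import NormalDist

def min_n_for_significance(r, alpha=0.05):
    """Approximate smallest N at which a sample correlation r is significant.

    Hypothetical helper based on the Fisher z approximation (two-sided test).
    """
    if r == 0 or abs(r) >= 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Solve atanh(|r|) * sqrt(N - 3) >= z_crit for N.
    return math.ceil((z_crit / math.atanh(abs(r))) ** 2 + 3)

for r in (0.5, 0.1, 0.01):
    print(f"r={r:<5}  needs roughly N >= {min_n_for_significance(r)}")
```

Every non-zero correlation gets a finite answer. Only a correlation of exactly zero has none, which is why the function rejects it.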

Sample size has a similar effect on statistical power. All other things being equal, a study with a larger sample size will have greater statistical power. Figure 03 shows two examples, along with an overlay of the two. The blue area of the overlay highlights where the statistical power is increased by the narrowing of the probability curves that results from a larger sample size.
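The effect on power can also be sketched numerically. The code below is an approximation under assumptions not stated in the text: a two-sided test at alpha = .05, a true population correlation of .30 chosen purely as an example, and the Fisher z approximation for the sampling distribution. The function name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(rho, n, alpha=0.05):
    """Approximate power to detect a population correlation rho with sample size n.

    Hypothetical helper; Fisher z approximation, two-sided test, requires n > 3.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Expected test statistic: the effect on the z scale over its standard error.
    shift = math.atanh(rho) * math.sqrt(n - 3)
    # Probability that the statistic falls beyond either critical value.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

for n in (20, 50, 100, 200):
    print(f"N={n:4d}  power = {approx_power(0.30, n):.2f}")
```

With the effect size and alpha held fixed, the only thing changing between rows is N, so any rise in power comes from the narrower curves alone.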

Figure 03: Statistical power with smaller N; statistical power with larger N; statistical powers with different sample sizes (overlay)