Type I and Type II errors

A Type I error is a “false positive”. When the correlation you find in a sample doesn't exist in the population, that's a Type I error.

A Type II error is a “false negative”. When there exists a correlation in a population but you fail to find it in your sample, that's a Type II error.

The following sections use sideways statistics to show both types of errors.

Sideways Type I errors

Figure 14: Sideways Type I errors

Figure 14 shows a linear regression done at a 95% confidence level. The population probability curve on the right side of the graph has 5% of its area below the null line, and the line that bisects it marks the minimum sample correlation that would be statistically significant.

The unusual thing in this graphic is that the sample probability curve on the left side of the graph is centered on the null line. It is placed there because that is what a Type I error involves: a population correlation of zero.

A Type I error is any analysis that finds a statistically significant sample correlation even though no correlation exists in the population. On the left side of the graph, the area under the sample probability curve (where “under” means between the curve and the vertical baseline) that lies above the minimum significant correlation is the chance of a Type I error. That shaded area is alpha, and it is 5% of the total area under the sample probability curve.
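The claim that alpha is the chance of a Type I error can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the sample size of 30, the normally distributed data, the one-tailed test, and the Fisher z approximation for the cutoff are all choices made here, not details taken from Figure 14. The names `pearson_r` and `type_i_rate` are hypothetical helpers.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def type_i_rate(n=30, trials=10_000, seed=0):
    """Fraction of samples from an uncorrelated population that look significant."""
    rng = random.Random(seed)
    # Assumed cutoff: one-tailed 5% critical correlation from the Fisher z
    # approximation (1.6449 is the 95th percentile of the standard normal).
    r_min = math.tanh(1.6449 / math.sqrt(n - 3))
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # ys is drawn independently of xs, so the population correlation is zero.
        ys = [rng.gauss(0, 1) for _ in range(n)]
        if pearson_r(xs, ys) > r_min:
            hits += 1  # a "significant" result that is really a false positive
    return hits / trials

print(type_i_rate())  # should land close to alpha = 0.05
```

Every hit in this loop is a Type I error by construction, because the two variables are generated independently. The fraction of hits should therefore hover near 5%, give or take simulation noise.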

Symmetry

An important thing to note here is that when the population correlation is zero, the area of the sample probability curve that extends above the minimum statistically significant sample correlation is always identical to the area of the population probability curve that lies below the null line. This is because the two curves have the same shape and mirror each other. The area of one curve that is cut off by the bisecting line of the other curve is therefore identical to the area of the other curve that is cut off by the first curve's bisecting line.

This symmetry is why lining up the population probability curve with 5% of the area below the zero/null line works for establishing a “95% confidence level”.
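The symmetry argument can be checked numerically. The following sketch assumes both curves are normal with the same standard error, which is a simplification adopted here for illustration; the value `se = 0.1` is hypothetical and any positive value gives the same result.

```python
from statistics import NormalDist

se = 0.1                              # hypothetical standard error shared by both curves
z_alpha = NormalDist().inv_cdf(0.95)  # one-tailed cutoff for a 95% confidence level
r_min = z_alpha * se                  # minimum significant sample correlation

sample_curve = NormalDist(mu=0.0, sigma=se)        # centered on the null line
population_curve = NormalDist(mu=r_min, sigma=se)  # bisected by the minimum significant correlation

area_above_min = 1 - sample_curve.cdf(r_min)  # sample curve above the cutoff
area_below_null = population_curve.cdf(0.0)   # population curve below the null line

# Both areas come out at about 0.05.
print(round(area_above_min, 4), round(area_below_null, 4))
```

Because the two curves differ only by a shift of `r_min`, the area cut off from each by the other's center line is the same.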

Sideways Type II errors

Figure 15: Sideways Type II errors

Figure 15 also shows a linear regression done at a 95% confidence level, but this example has a statistical power of 84%.

A Type II error occurs when a population correlation exists but the sample fails to find it. On the graph, a Type II error is any sample correlation that falls below the minimum statistically significant correlation. A sample analysis with a result in this region would fail to be statistically significant even though the population correlation is not null. Therefore, the area under the sample probability curve that lies below the minimum significant correlation is the chance of a Type II error. That shaded area is beta, and it is 16% (100% − 84% = 16%) of the total area under the sample probability curve.
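The arithmetic behind beta can be sketched in a few lines. This example assumes a normal sample probability curve and a one-tailed test, and it measures everything in standard errors above the null line; those conventions and the variable names are assumptions made here rather than details from Figure 15.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
power = 0.84

z_alpha = std_normal.inv_cdf(1 - alpha)  # cutoff, in standard errors above null
z_power = std_normal.inv_cdf(power)      # distance of the population value above the cutoff
effect = z_alpha + z_power               # population correlation, in standard errors above null

# The sample curve is centered on the population value; beta is its area below the cutoff.
sample_curve = NormalDist(mu=effect, sigma=1.0)
beta = sample_curve.cdf(z_alpha)

print(round(beta, 2), round(1 - beta, 2))  # beta and power
```

Under these assumptions the population correlation sits roughly 2.64 standard errors above null, and the part of the sample curve that falls short of the cutoff is the 16% described above.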

Are Type II ‘second-class’ errors?

Type I and Type II errors are at the core of what significance testing is all about. Many researchers in the social sciences seem to consider only Type I errors to be important, but there is no basis for that distinction in the statistics themselves. As is mentioned on the page “Interaction of alpha and beta”, if researchers don't care at all about Type II errors, they can minimize the value of alpha by maximizing the value of beta, which in practical application is typically 0.50, giving a statistical power of 50%. For example, a large number of the empirical studies in the Academy of Management Review have a statistical power of roughly 50%, which is evidence that researchers are minimizing the chance of Type I errors at the expense of increasing the chance of Type II errors.
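The trade-off between alpha and beta can be made concrete with a short calculation. The sketch below assumes a one-tailed test, normal curves, and a fixed population effect measured in standard errors; `alpha_for_beta` is a hypothetical helper, and the effect size is simply the one implied by a 5% alpha and 84% power design.

```python
from statistics import NormalDist

std_normal = NormalDist()

def alpha_for_beta(beta, effect):
    """One-tailed alpha implied by a chosen beta, for a fixed effect in standard errors."""
    # beta = P(sample < cutoff) = Phi(cutoff - effect), so solve for the cutoff.
    cutoff = effect + std_normal.inv_cdf(beta)
    # alpha is the area of the null-centered curve above that cutoff.
    return 1 - std_normal.cdf(cutoff)

# Effect size implied by alpha = 5% and power = 84% (an assumed starting design).
effect = std_normal.inv_cdf(0.95) + std_normal.inv_cdf(0.84)

for beta in (0.16, 0.30, 0.50):
    print(beta, round(alpha_for_beta(beta, effect), 4))
```

With the effect held fixed, raising beta from 0.16 to 0.50 moves the cutoff up to the population value itself and shrinks alpha from 5% to well under 1%.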
