## Testing for importance using significance

As part of designing a study that will use linear regression analysis, you should consider what the best hypothesis against which to test for significance would be. You want a hypothesis that both fits the theory you're intending to support or debunk and works with the sort of data that you're expecting to have.

The page on this site titled “Insignificance is not failure” discusses why null-hypothesis significance testing is not really the universal litmus test for significance that so many researchers in the humanities treat it as. An example of an alternative test given on that page is an upper-bound significance test, one in which you're attempting to demonstrate that a correlation is too small in effect to be important. This page looks at that sort of analysis in a little more detail.

What would it look like if we designed our experiment from the beginning to check for an upper bound? That is, what if we were trying to demonstrate that, with 95% confidence, a correlation is lower than some value that we have interpreted as the minimum **important** correlation?

First, we'd mark our upper bound, which equates to the minimum important correlation, on the vertical baseline of our ‘sideways’ graph of the analysis. Then, the population probability curve on the right side of the graph would be positioned so that 95% of the area between it and the vertical baseline would be below that minimum important correlation. That would mean that the line that bisects the population probability curve is the maximum sample correlation that demonstrates, with ‘*p*<.05’, that the population correlation does not meet your criteria for importance. See Figure 01.

As with null-hypothesis testing, the sample probability curve on the left side of the ‘sideways’ graph is centered on the population correlation we're expecting to see. The farther away from the minimum important correlation that expected size of effect is, the better off we are. As can be seen in Figure 01, one way to make the statistical power larger would be to move the sample probability curve down, away from the upper bound. In this example, the statistical power is a somewhat dismal 16%.

So, arranging things on a sideways graph to help understand an upper-bound significance test is fairly straightforward. However, when it comes to doing the actual calculation, things become somewhat more complicated. More specifically, they are less symmetric.

When considering correlations near zero (which is where null-hypothesis significance testing is most relevant), you can assume that the probability distributions are symmetric about some mean value. But when looking at larger correlations, you can no longer assume that.

This is because correlations can never have a magnitude above 1. So, as the correlation of interest moves higher up the vertical baseline, the probability distribution begins to “pile up” against the maximum value of 1, and this distorts the curve, making it asymmetric.

Figure 02 shows this. The line marking the mean of the curve no longer perfectly bisects the area of the sample correlation probability curve. When one side of the curve is truncated, the probability of the remaining values increases, distorting the distribution of area about the mean. This means we can no longer assume that 50% of the area is on either side of the correlation that defines the curve.
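The asymmetry can also be checked numerically. What follows is a minimal simulation sketch, not part of the original analysis: it assumes bivariate normal data, and the function names, the sample size of 20, and the correlations of 0 and 0.8 are all illustrative choices. It draws many samples, computes each sample correlation, and reports what share of them falls below their own mean.

```python
import numpy as np


def simulate_sample_correlations(rho, n, n_sims=20000, seed=0):
    """Simulate sample correlations from bivariate normal data.

    rho is the (assumed) population correlation, n the sample size.
    Returns an array holding one sample correlation per simulated study.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_sims, n))
    noise = rng.standard_normal((n_sims, n))
    # Build y so that its population correlation with x is rho.
    y = rho * x + np.sqrt(1.0 - rho**2) * noise
    # Pearson correlation computed row by row.
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    numerator = (xc * yc).sum(axis=1)
    denominator = np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))
    return numerator / denominator


def share_below_mean(values):
    """Fraction of values that lie below their own mean."""
    return float(np.mean(values < values.mean()))


if __name__ == "__main__":
    for rho in (0.0, 0.8):
        r = simulate_sample_correlations(rho, n=20)
        print(f"rho={rho}: share of sample r below the mean = "
              f"{share_below_mean(r):.3f}")
```

For a population correlation of zero the share should sit close to one half, while for the larger correlation it should fall noticeably below one half, because the long tail of the distribution points away from the bound at 1.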

To do the calculations, you have to correct for this distorting truncation to the probability curves. You can do this by transforming the data; typically you would transform each correlation into its Fisher *z*-score, using the following equation.
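The Fisher transformation is conventionally written as below, where *r* is the correlation and *z*(*r*) is its transformed value. The second form, the inverse hyperbolic tangent, is an equivalent way of writing the same thing.

$$
z(r) = \frac{1}{2}\,\ln\!\left(\frac{1 + r}{1 - r}\right) = \operatorname{artanh}(r)
$$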

What the Fisher transformation does is effectively remove the upper and lower bounds (at +1 and -1) that are a consequence of the nature of correlations. Figure 03 shows one way of understanding what the transformation does. The perfectly straight diagonal line represents untransformed correlations; it's the line where *r*=*r*. The long curving line shows the results of Fisher transformations; it's the line where *r*=*z*(*r*). Near the origin the two lines are almost identical, but as the correlation magnitude approaches 1, the transformed values split away and extend toward infinity. This effectively removes the truncating bounds, so the probability curves have nothing to “bunch up” against.
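To make the calculation concrete, here is a minimal sketch of an upper-bound test carried out in Fisher-transformed space. It rests on assumptions that this page does not state: that the transformed sample correlation is approximately normal with standard error 1/√(*n*−3), and that the test is one-sided. The function names and the example values (a minimum important correlation of 0.3 and a sample size of 50) are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z(r):
    """Fisher transformation of a correlation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def upper_bound_critical_r(r_min_important, n, confidence=0.95):
    """Largest sample correlation that still demonstrates, at the given
    confidence, that the population correlation is below r_min_important.

    Assumes the usual normal approximation with standard error 1/sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    se = 1.0 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(confidence)  # about 1.645 for 95%
    # Step down from the bound in z-space, then transform back to r.
    return math.tanh(fisher_z(r_min_important) - z_crit * se)


def upper_bound_power(r_expected, r_min_important, n, confidence=0.95):
    """Probability that a sample correlation falls at or below the critical
    value when the population correlation is actually r_expected."""
    se = 1.0 / math.sqrt(n - 3)
    z_cut = fisher_z(upper_bound_critical_r(r_min_important, n, confidence))
    return NormalDist(mu=fisher_z(r_expected), sigma=se).cdf(z_cut)


if __name__ == "__main__":
    # Hypothetical study: bound of 0.3, expected correlation 0.1, n = 50.
    print("critical r:", round(upper_bound_critical_r(0.3, 50), 3))
    print("power:", round(upper_bound_power(0.1, 0.3, 50), 3))
```

Working in *z*-space is what lets the sketch treat both probability curves as symmetric; the critical value is only converted back to a correlation at the end. As the expected correlation moves farther below the bound, or as *n* grows, the computed power rises.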