Probabilities of Clinical or Practical Significance

SPORTSCIENCE · sportsci.org
News & Comment / Research Resources

Will G Hopkins

Sportscience 6, sportsci.org/jour/0201/wghprob.htm, 2002 (638 words)
Sports Studies, Auckland University of Technology, Auckland 1020, New Zealand. Reviewer: Alan M Batterham, Department of Sport and Exercise Science, University of Bath, Bath BA2 7AY, UK.

Probabilities that the true value of an effect is beneficial, trivial and harmful are more meaningful than the traditional P value. The meaning is enhanced by expressing the probabilities in qualitative terms such as unlikely, almost certainly, and so on. I present here a table for assigning such terms to probabilities, and a link to a slide show on statistical vs clinical or practical significance. KEYWORDS: effect magnitude, P value, statistical significance.

In a short item in the previous issue of Sportscience I argued that the use of P values and statistical significance prevents publication of good research. I presented an alternative approach for assessing research, based on probabilities that the true value of an effect is clinically beneficial, trivial, and harmful. I also provided a link to a spreadsheet for calculation of these probabilities, and there is a page on these and related concepts in A New View of Statistics.

I have now extended the concept by assigning what I consider to be reasonable thresholds for plain-language descriptions of the probabilities. For example, if the effect you have studied turns out to have a probability of 0.80 of being beneficial, you would describe it as likely to be beneficial, or probably beneficial. The same effect might have probabilities of 0.16 of being trivial and 0.04 of being harmful, in which case you would say that the effect is unlikely to be trivial and very unlikely to be harmful. You'd make these qualitative assessments in the Discussion section of a paper or thesis, whereas the Results section would contain a more neutral statement, such as: the chances that the effect is beneficial/trivial/harmful are 80/16/4%. Here's the full schema for describing the probabilities, which I also show as chances and odds:

This table is part of a Powerpoint slide show (link below) that I am using for a seminar with the title Statistical vs Clinical or Practical Significance. The presentation includes the following points:

• An outline of the meaning and shortcomings of hypothesis testing, P values and statistical significance.

• The meaning of, and need for, likely (confidence) limits to convey precision of estimation.

• Definition of the probabilities that an effect is clinically or practically beneficial, trivial, and harmful.

• The above table for interpreting the probabilities.

• Examples of statistically significant and statistically non-significant effects interpreted in a more meaningful and publication-worthy fashion using probabilities of clinical or practical significance.

I finish the presentation with the following summary of advice for reporting your research…

• Show the observed magnitude of the effect.

• Attend to precision of estimation by showing likely limits of the true value.

• Show the P value if you must, but do not test a null hypothesis and do not mention statistical significance.

• Attend to clinical or practical significance by stating the smallest clinically beneficial and/or harmful value then showing the probabilities that the true effect is beneficial, trivial, and harmful.

• Make a qualitative statement about the clinical or practical significance of the effect, using unlikely, almost certainly, and so on.
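
The last two points can be sketched in a few lines of Python. This is an illustration only, not the spreadsheet's method: it assumes a normally distributed effect statistic with a known standard error, treats positive values as beneficial, and uses the same smallest worthwhile value for benefit and for harm. The function names and the cut-offs in THRESHOLDS are assumptions, chosen to agree with the examples given earlier (0.80 likely, 0.16 unlikely, 0.04 very unlikely); the terms other than unlikely, very unlikely, likely and almost certainly are likewise assumed.

```python
from statistics import NormalDist

# Assumed cut-offs (upper bound of each band, qualitative term).
# Hypothetical values, chosen only to be consistent with the examples in the text.
THRESHOLDS = [
    (0.01, "almost certainly not"),
    (0.05, "very unlikely"),
    (0.25, "unlikely"),
    (0.75, "possibly"),
    (0.95, "likely"),
    (0.99, "very likely"),
]


def describe(p):
    """Return a plain-language term for a probability between 0 and 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    for upper, term in THRESHOLDS:
        if p < upper:
            return term
    return "almost certainly"


def odds_in_favour(p):
    """Convert a probability (less than 1) to odds in favour, e.g. 0.8 is about 4 to 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must be at least 0 and less than 1")
    return p / (1.0 - p)


def effect_probabilities(observed, std_error, smallest_worthwhile):
    """Return (beneficial, trivial, harmful) probabilities for the true effect.

    Assumes the true value is normally distributed about the observed value
    with standard deviation equal to the standard error.
    """
    dist = NormalDist(mu=observed, sigma=std_error)
    harmful = dist.cdf(-smallest_worthwhile)
    beneficial = 1.0 - dist.cdf(smallest_worthwhile)
    trivial = 1.0 - beneficial - harmful
    return beneficial, trivial, harmful


if __name__ == "__main__":
    # Hypothetical effect: observed 1.0 units, standard error 1.0, smallest worthwhile 0.2.
    for label, p in zip(("beneficial", "trivial", "harmful"),
                        effect_probabilities(1.0, 1.0, 0.2)):
        print(f"{label}: {100 * p:.0f}% chance, {describe(p)}")
```

With these assumed cut-offs, describe(0.80), describe(0.16) and describe(0.04) return likely, unlikely and very unlikely, matching the worked example in the text. For small samples a t distribution would be a more appropriate choice than the normal distribution used here.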

As far as the likely limits are concerned, 95% is definitely too high to convey precision of estimation. I now recommend 50%, which should be called possible limits, in accordance with the above table of probabilities. I doubt whether they will come into widespread use, or any use at all, during my lifetime.

Updated Dec 12, 2006. Minor changes to slideshow, which is now called Making Inferences. Download in Powerpoint or PDF version.

Updated August 8, 2004. New version of slideshow has a more extensive treatment of clinical interpretation of confidence limits, as well as a more succinct critique of statistical significance. This version was presented in a minisymposium at the annual meeting of the American College of Sports Medicine in Indianapolis, June 5, 2004.

Updated March 6, 2003. The slideshow now contains something on Cohen's smallest worthwhile effects, a slide showing use of the spreadsheet, and a few cosmetic improvements.

Updated Nov 3, 2002. The spreadsheet for confidence limits now automatically displays the qualitative probabilities corresponding to the quantitative probabilities in the above table.

Updated Oct 29, 2002. Another candidate to convey precision of estimation is 68% limits, which define a confidence interval approximately half as wide as the 95% confidence interval (for normally distributed effect statistics). These are also possible limits, according to the above table of probabilities. We could also use 90% limits, which would be likely or probable limits.
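
The relative widths of these limits can be checked with a short Python sketch. It assumes a normally distributed effect statistic, so that the limits are the observed value plus and minus a multiple of the standard error; the function names are hypothetical and the code is not taken from the spreadsheet.

```python
from statistics import NormalDist


def limit_multiplier(level):
    """Multiple of the standard error for central limits at the given level (0 to 1)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    # Central limits leave (1 - level) / 2 in each tail of the standard normal.
    return NormalDist().inv_cdf(0.5 + level / 2.0)


def confidence_limits(observed, std_error, level):
    """Return (lower, upper) limits, assuming a normal sampling distribution."""
    z = limit_multiplier(level)
    return observed - z * std_error, observed + z * std_error


if __name__ == "__main__":
    z95 = limit_multiplier(0.95)
    for level in (0.50, 0.68, 0.90, 0.95):
        z = limit_multiplier(level)
        print(f"{100 * level:.0f}% limits: +/- {z:.2f} SE, "
              f"{z / z95:.2f} times the width of 95% limits")
```

Under this assumption the 68% limits lie about one standard error either side of the observed value and the 95% limits about 1.96 standard errors either side, which is where the "approximately half as wide" comes from.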

Reference: Hopkins WG (2002). Statistical vs clinical or practical significance [Slideshow]. Sportscience 6, sportsci.org/jour/0201/Statistical_vs_clinical.ppt (updated December 2006)