I have a confession to make… I used to hate stats from the bottom of my cold black heart! Now I just have a simmering dislike for the subject. Please don’t misunderstand, I love seeing the numbers and trying to understand them, but the process of how the numbers come about is considerably less appealing to me. However, over the past few months, I have gone down the “rabbit hole” of Kappa Coefficients for my MSc. I’ve come to realise that I can’t put my head in the sand, claim ignorance, and let others worry about stats. My tendency has always been to pay more attention to the “human” aspect of sport science. You might pick up on that trend in my writing, but this time, I want to briefly dip my toe into the icy waters of statistics. I’m still learning to crawl when it comes to all this, so the following info is largely aimed at less statistically developed readers. If there’s a sport science stats guru out there who wants to get involved with SPORT SCIENCE COLLECTIVE, your intellectual contributions would be welcome.
For now, you’re stuck with me. In the 2nd trial issue of SPORT SCIENCE COLLECTIVE, Dr Jason Tee explains the importance of making data-driven decisions when working with coaches or teams. It is also important that we are able to be discerning when reading research. There has been some debate around an apparently large proportion of published research being false[1–3]. A confounding factor in research is terminology. Through my own delving into stats and from researching for this article, I regularly find inconsistencies in the terminology. Are “reproducibility”, “reliability”, and “agreement” really interchangeable? I’ve also been a bit confused when I’ve read about effect statistics, effect measures, or effect sizes. Ultimately, an understanding of statistics and the terminology will help us to be discerning when reading research papers or interpreting our own data.
First up is the p-value, probably the most familiar statistic in research. Essentially used to determine whether results are “statistically significant”, it comes under fire for being dichotomous (significant/not significant) and not really telling us much. The p-value is calculated under the null hypothesis (no effect in the population): it is the probability of seeing a result at least as extreme as the one in your sample if there really were no effect. If that probability falls below a chosen threshold (usually 0.05), the result is declared significant. I hope that made sense. I find it a bit convoluted, to be honest. This raises the question of “statistical vs clinical significance”. Something that is statistically significant must still be given real-world significance and meaning. Beyond significance testing, we should concern ourselves with variance and effect statistics. I think it’s safe to skip over the basic centrality stats (e.g. the mean) in the interest of the word count and your precious time.
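To make the p-value a bit less abstract, here’s a toy sketch of where one comes from. I’ve used a one-sample z-test (which assumes the population SD is known) purely because it only needs Python’s standard library; the numbers are made up for illustration, not from any real study.

```python
# Toy illustration of where a p-value comes from, using a one-sample
# z-test so we only need Python's standard library.
from math import sqrt
from statistics import NormalDist

def z_test_p_value(sample_mean, null_mean, sigma, n):
    """Two-sided p-value: the probability of a result at least this
    extreme if the null hypothesis (no effect) were true."""
    se = sigma / sqrt(n)                    # standard error of the mean
    z = (sample_mean - null_mean) / se      # how many SEs from the null
    return 2 * (1 - NormalDist().cdf(abs(z)))

# e.g. a sample mean of 0.5 vs a null of 0, known SD of 1, n = 25
p = z_test_p_value(0.5, 0.0, 1.0, 25)
print(round(p, 3))  # below 0.05, so "significant" at the usual threshold
```

Note that the code only tells you the result is unlikely under “no effect”; whether a mean of 0.5 matters in the real world is the clinical-significance question the p-value can’t answer.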
Variance around the estimates can be inferred from the range, percentile ranges, the Standard Deviation (SD) and the Coefficient of Variation (CV). In a nutshell, a smaller SD means that the results are grouped more closely around the mean, while the CV is simply the SD expressed as a percentage of the mean. If you’re wondering about the Standard Error of the Mean (SEM), Dr Will Hopkins reckons that it’s fairly pointless, and I’ll trust him on this. Now, out of left field, Confidence Intervals (CI) are an important stat to understand but aren’t strictly a measure of variance. A CI is built from the SD and the sample size, and refers to the range in which we’re most likely to find the true value of the estimate. It provides us with information about the precision of the estimate: a wider range around the mean indicates a less precise result. Then there are effect statistics, which include differences between means, correlations, frequencies, ratios and more[4,5]. These stats provide us with insight into the magnitude and direction of the results. If you’re looking for guidelines for ascribing meaning to the magnitudes of effect statistics, then click here.
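For the hands-on types, here are minimal sketches of a few of the stats above: the CV, an approximate 95% CI for the mean (normal approximation, z = 1.96), and Cohen’s d as one example of an effect statistic for a difference between means. Everything uses Python’s standard library, and the “pre/post” data are invented for demonstration only.

```python
# Minimal sketches of the descriptive and effect statistics mentioned
# above, using only Python's standard library. Data are made up.
from math import sqrt
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV: the SD expressed as a percentage of the mean."""
    return stdev(data) / mean(data) * 100

def confidence_interval_95(data):
    """Approximate 95% CI for the mean (normal approximation, z = 1.96)."""
    se = stdev(data) / sqrt(len(data))   # standard error of the mean
    m = mean(data)
    return (m - 1.96 * se, m + 1.96 * se)

def cohens_d(group_a, group_b):
    """Effect size for a difference between two means, using a pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = sqrt(((n_a - 1) * stdev(group_a) ** 2 +
                      (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd

pre = [10, 12, 11, 13, 14]
post = [8, 9, 10, 9, 9]
print(round(coefficient_of_variation(pre), 1))  # spread relative to the mean
lo, hi = confidence_interval_95(pre)
print(round(lo, 2), round(hi, 2))               # narrower range = more precise
print(round(cohens_d(pre, post), 2))            # magnitude of the difference
```

A caveat: with samples this small the normal approximation for the CI is optimistic (a t-distribution would give a wider interval), but the idea of “a range around the estimate that reflects its precision” carries over.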
This was never meant to be an extensive explanation of the stats, but rather a suggestion as to the important stats for sport scientists to be familiar with. I tried to categorise groups of stats and the actual stats used in the table below. While we wait for a stats hero to arise, this will hopefully help my fellow stats novices to navigate this intimidating topic.
1. Ioannidis, J. P. A. Why most published research findings are false. PLoS Med. 2, e124 (2005).
2. Ioannidis, J. P. A. Why Most Published Research Findings Are False: Author’s Reply to Goodman and Greenland. PLoS Med. 4, e215 (2007).
3. Goodman, S. & Greenland, S. Why Most Published Research Findings are False: Problems in the Analysis. PLoS Med. 4, e168 (2007).
4. Cumming, G. The new statistics: Why and how. Psychol. Sci. 25, 7–29 (2014).
5. Sullivan, G. M. & Feinn, R. Using Effect Size - or Why the P Value Is Not Enough. J. Grad. Med. Educ. 4, 279–82 (2012).