Scientific with a small s

My inspiration for this blog’s motto comes from Ziliak & McCloskey (2004). They quote from Bob Solow’s Nobel Prize acceptance speech, after which they write:

“Solow recommends we ‘try very hard to be scientific with a small s’; but the authors we have surveyed in the AER [American Economic Review, GM], by contrast, are trying to be scientific with a small t.” (p. 544).

Their “small t” refers to the t statistic on the basis of which researchers determine the p-values they use to assess the statistical significance of their findings. A small p (smaller than .05) is usually taken to mean that the test result is statistically significant.
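To make the link between the t statistic and the p-value concrete, here is a minimal sketch in Python of a one-sample t test. The data values, the null value of 5.0 and the function names are all made up for illustration; they do not come from any of the sources cited here. The p-value is obtained by numerically integrating the t density with the standard library, which is a convenience choice on my part rather than how a statistics package would do it.

```python
import math
from statistics import mean, stdev

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: probability of |T| >= |t| if the null hypothesis is true."""
    upper = abs(t)
    h = upper / steps
    # Simpson's rule (steps must be even) for the area under the density from 0 to |t|
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    # The area between -|t| and |t| is twice the area from 0 to |t|
    return max(0.0, 1 - 2 * central_area)

def one_sample_t(data, mu0):
    """Return the t statistic and degrees of freedom for H0: population mean = mu0."""
    n = len(data)
    t = (mean(data) - mu0) / (stdev(data) / math.sqrt(n))
    return t, n - 1

scores = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 4.7, 6.4]  # hypothetical data
t, df = one_sample_t(scores, mu0=5.0)
p = two_sided_p(t, df)
print(f"t({df}) = {t:.2f}, p = {p:.3f}")
```

The larger the absolute value of t, the smaller the p-value, which is why chasing a "small p" amounts to chasing a large t.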

There are many reasons to believe that null-hypothesis significance testing (NHST) is basically unscientific. That’s why I became convinced that you cannot do science with a small p (significance testing). I hope that after reading the blog posts yet to come, you will be convinced as well. (If you can’t wait: Kline (2013) (see below) is a good place to start getting convinced.)

What does it mean to be scientific with a small s? To Solow (as cited in Ziliak & McCloskey, 2004) it simply means thinking logically and respecting the facts. To my mind, thinking logically as a prerequisite of being scientific (with a small s) includes thinking logically about the results of statistical analyses. For instance, you should not mistakenly believe that a small p-value means that it is unlikely that a result is due to chance, nor should you mistakenly believe that the long-term behavior of a decision procedure has anything to do with the evidence in your actual data (the facts).
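The first of these misreadings can be checked with a small simulation. The sketch below is my own illustration, not something taken from the cited sources. It reads "due to chance" as "the null hypothesis is true", and it assumes a hypothetical research field in which 80% of the tested null hypotheses are true, the real effects are 0.5 standard deviations, and every study runs a two-sided z test on 20 observations with a known standard deviation of 1. All of these numbers are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

def simulate(n_studies=20000, prior_null=0.8, effect=0.5, n=20, alpha=0.05, seed=1):
    """Share of significant results for which the null hypothesis is in fact true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = 1 / n ** 0.5  # standard error of the mean when the sd is 1
    significant = 0
    significant_and_null = 0
    for _ in range(n_studies):
        null_true = rng.random() < prior_null
        mu = 0.0 if null_true else effect
        # The mean of n observations with sd 1 is normally distributed with sd = se
        sample_mean = rng.gauss(mu, se)
        z = sample_mean / se
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p < alpha:
            significant += 1
            significant_and_null += null_true
    return significant_and_null / significant

share = simulate()
print(f"Share of significant results where the null is true: {share:.2f}")
```

Under these assumptions the expected share is about 0.25, so roughly one in four significant results comes from a true null even though every one of them has p < .05. The exact figure depends entirely on the assumed proportion of true nulls and on the power of the studies, which is precisely the information a p-value does not contain.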

Ziliak & McCloskey (2004) write about economic research, but significance testing is of course not limited to economic research. Kline (2013, pp. 118-199) concludes in his chapter about cognitive distortions in significance testing (and he is putting it mildly):

“Significance testing has been like a collective Rorschach inkblot test for the behavioral sciences: What we see in it has more to do with wish fulfillment than reality. This magical thinking has impeded the development of psychology and other disciplines as cumulative sciences. […] the gap between what is required for significance tests to be accurate and characteristics of real world studies is just too great.”

So, this blog is about being scientific with a small s, with a main focus on the logic and illogic of NHST, because you simply cannot do science with only a small p.

Kline, R.B. (2013). Beyond significance testing: Statistics reform in the behavioral sciences (2nd ed.). Washington: APA.
Ziliak, S.T., & McCloskey, D.N. (2004). Size matters: The standard error of regressions in the American Economic Review. Journal of Socio-Economics, 33, 527-547.