Some papers about p values

Jump to follow-up

These papers have nothing much to do with single molecule kinetics. They were written by David Colquhoun after his retirement from the world of single ion channels, as a way to keep him off the streets. They are listed here as a convenient place to keep a record.

The papers concern the misinterpretation of tests of significance. Such tests were hardly ever used in our single ion channel work. They represent a return to DC's interest in statistical inference in the 1960s, which culminated in the publication of a textbook, Lectures on Biostatistics (OUP, 1971). The textbook has aged quite well, with the exception of the parts on the interpretation of p values. In the 1960s I missed entirely the problems of null hypothesis significance testing. But better late than never.

The problem lies in the fact that most people still think that the p value is the probability that your results occurred by chance: see, for example, Gigerenzer et al. (2006) [download pdf]. It is nothing of the sort.

The false positive risk (FPR) is the probability that a result that has been labelled as “statistically significant” is in fact a false positive. It is always bigger than the p value, often much bigger.
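
To see why, here is a back-of-envelope calculation (my own illustrative numbers, not taken from any particular study). Suppose that 10% of the hypotheses you test are real effects, that you declare significance at p < 0.05, and that your tests have a power of 0.8.

```python
# Illustrative arithmetic only: the prevalence (0.1), threshold (0.05) and
# power (0.8) are assumptions chosen for this example, not empirical values.
prior, alpha, power = 0.10, 0.05, 0.80

false_positives = alpha * (1 - prior)   # true nulls that nevertheless reach p < 0.05
true_positives = power * prior          # real effects that are detected

fpr = false_positives / (false_positives + true_positives)
print(fpr)   # 0.36: over a third of the "significant" results are false positives
```

Even though every one of those results passes the p < 0.05 threshold, roughly a third of them would be false positives.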

My recommendations. In brief, I suggest that p values and confidence intervals should still be cited, but that they should be supplemented by a single number that gives an idea of the false positive risk (FPR). The simplest way to do this is to calculate the FPR that corresponds to a prior probability of 0.5 that there is a real effect. This would still be optimistic for implausible hypotheses, but it would be a great improvement on p values alone. The FPR, calculated in this way, is just a more comprehensible way of citing the likelihood ratio (see the 2018 paper).
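
As a rough sketch of that arithmetic (mine, not code from the papers): with a prior probability of 0.5 the prior odds are 1, so the posterior odds of there being a real effect are simply the likelihood ratio (LR), and the FPR reduces to 1/(1 + LR).

```python
# A minimal sketch, assuming the likelihood ratio (in favour of there being a
# real effect) has already been calculated, e.g. as in the 2017 paper.
def fpr_from_lr(lr, prior=0.5):
    """False positive risk from a likelihood ratio and a prior probability of a real effect."""
    prior_odds = prior / (1.0 - prior)
    return 1.0 / (1.0 + lr * prior_odds)

print(fpr_from_lr(3.0))          # LR = 3 with prior 0.5 gives FPR = 0.25
print(fpr_from_lr(3.0, 0.1))     # the same evidence with prior 0.1 gives FPR = 0.75
```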

Please note: the term “false discovery rate”, which was used in earlier papers, has now been replaced by “false positive risk”. The reasons for this change are explained in the introduction of the 2017 paper.

Original papers about the problem


Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science. This first paper looked at the risk of false positive results by simulation of Student's t tests. The advantage of simulation is that it makes the assumptions very clear without much mathematics. The disadvantage is that the results aren't very general.
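
For readers who want to see what such a simulation looks like, here is a minimal sketch (my own, not the code used in the paper; the group size, effect size and prior probability are assumptions chosen for illustration).

```python
# Simulate many two-sample t tests, a fraction of which have a real effect,
# and count how often a "significant" result comes from a true null.
# Assumed settings: n = 16 per group, SD = 1, effect = 1 SD, prior = 0.1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, effect, prior, alpha, n_sim = 16, 1.0, 0.1, 0.05, 20_000

false_pos = true_pos = 0
for has_effect in rng.random(n_sim) < prior:
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect if has_effect else 0.0, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < alpha:
        if has_effect:
            true_pos += 1
        else:
            false_pos += 1

print("false positive risk:", false_pos / (false_pos + true_pos))
```

With these settings the answer comes out at roughly a third, even though every counted result has p < 0.05.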

Colquhoun, D. (2017). The reproducibility of research and the misinterpretation of p-values. Royal Society Open Science. This paper gives, in the appendix, mathematically exact solutions for the false positive risk, calculated by the p-equals method. This allows the false positive risk to be calculated, as a function of the observed p value, for a range of sample sizes. A web calculator is provided that makes the calculations simple to do.
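
The gist of the p-equals calculation can be sketched as follows (this is my own illustration, not the code behind the web calculator: it assumes a two-sample t test with a true effect of one standard deviation under the alternative; the exact treatment is in the paper's appendix).

```python
# Compare how probable the observed t value (i.e. a p value *equal* to the one
# observed) is under the null and under the alternative, then combine the
# resulting likelihood ratio with the prior odds of there being a real effect.
import numpy as np
from scipy import stats

def fpr_p_equals(p_obs, n_per_group, effect_sd=1.0, prior=0.5):
    df = 2 * n_per_group - 2
    t_obs = stats.t.isf(p_obs / 2, df)            # observed t for a two-sided p
    ncp = effect_sd * np.sqrt(n_per_group / 2)    # noncentrality under the alternative
    lik_h0 = stats.t.pdf(t_obs, df) + stats.t.pdf(-t_obs, df)
    lik_h1 = stats.nct.pdf(t_obs, df, ncp) + stats.nct.pdf(-t_obs, df, ncp)
    lr = lik_h1 / lik_h0                          # likelihood ratio, alternative vs null
    prior_odds = prior / (1 - prior)
    return 1 / (1 + lr * prior_odds)

print(fpr_p_equals(0.05, n_per_group=16))         # about 0.26 for p = 0.05, prior 0.5
```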

Colquhoun, D. and Longstaff, C. (2017). Web calculator for false positive risk http://fpr-calc.ucl.ac.uk/

Colquhoun, D. (2018) The false positive risk: a proposal concerning what to do about p values, American Statistician, in press. Full text available at https://arxiv.org/abs/1802.04888. This paper examines more closely than before the assumptions that are made in calculations of FPR. It makes concrete proposals about how to solve the problem posed by the inadequacy of p values, with examples.

Popular accounts of the problem

Colquhoun, D. (2015) False discovery rates and P values: the movie. On YouTube. This slide show is now superseded by the 2018 version.

Colquhoun, D. (2015). The perils of p-values. In Chalkdust magazine. Available at http://chalkdustmagazine.com/features/the-perils-of-p-values/. Chalkdust is a magazine run by students at UCL. This article deals with the principles of randomisation tests as a non-mathematical way to get p values, plus a bit about what’s wrong with p values.

Colquhoun, D. (2015). Randomisation tests. How to get a P value with no mathematics. A short (6 slides, 15 min) video on YouTube. Forget t tests. The randomisation test is at least as powerful, and it makes no assumption of normal distributions. Furthermore, it makes very clear that random allocation of treatments is an essential assumption for all tests of statistical significance. Of course the result is still just a p value. It doesn’t tell you the probability that you are wrong: for that, see the other stuff on this page.
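
For anyone who prefers code to slides, a randomisation test can be sketched in a few lines (the numbers below are made up purely for illustration): shuffle the treatment labels many times and ask how often the re-shuffled difference between group means is at least as large as the one actually observed.

```python
import numpy as np

rng = np.random.default_rng(0)
treated = np.array([4.2, 5.1, 6.3, 5.8, 4.9, 6.0])   # invented example data
control = np.array([3.9, 4.4, 5.0, 4.1, 4.7, 4.3])

observed = treated.mean() - control.mean()
pooled = np.concatenate([treated, control])

n_shuffles = 100_000
count = 0
for _ in range(n_shuffles):
    rng.shuffle(pooled)                               # random re-allocation of labels
    diff = pooled[:len(treated)].mean() - pooled[len(treated):].mean()
    if abs(diff) >= abs(observed):                    # two-sided comparison
        count += 1

print("randomisation-test p value:", count / n_shuffles)
```

No distributional assumption is needed; the justification comes from the random allocation itself. But, as said above, the result is still only a p value.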

Colquhoun, D. (2016). The problem with p-values. Aeon magazine. (This attracted 147 comments.) This essay is about the logic of inductive inference. It is a non-mathematical introduction to the ideas raised in my 2014 paper.

Colquhoun, D. (2017) Five ways to fix statistics. State false positive risk, too. Nature, volume 551. A collection of short comments by five authors on what should be done about p values.

Colquhoun, D. (2018). The false positive risk: a proposal concerning what to do about p-values. This video is a slightly extended version of a talk that I gave at the Evidence Live meeting, June 2018, at the Centre for Evidence-Based Medicine, Oxford. It supersedes my earlier 2015 video on the same topic. It is an exposition of the ideas that are given in more detail in the 2017 paper and in the 2018 paper.

Follow-up