In my article, I wrote: "Whether you accept or reject the null hypothesis is then based upon how big the probability is that the given hypothesis is true. That's the way basic statistical tests work."

Efrique responds:

This is not right (you've strayed into my area, so I am probably going to get more technical than required).

A p-value is the probability of a test statistic at least as extreme as the one observed, given that the null hypothesis is TRUE. This is entirely different from the probability that the null is true:

Something like P(T >= t | H0) is not at all the same thing as P(H0) or even P(H0 | T = t).
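To make the distinction concrete, here is a minimal sketch using a coin-flip test. Everything numeric in it is an assumption chosen for illustration and is not from the comment above: 20 flips, 15 heads observed, H0 is a fair coin (p = 0.5), and, for the Bayesian quantity only, a single alternative H1 with p = 0.75 and a prior P(H0). The helper names `binom_pmf`, `binom_tail`, and `posterior_h0` are hypothetical.

```python
from math import comb


def binom_pmf(n, k, p):
    """P(T = k) for T ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(n, t, p):
    """P(T >= t) for T ~ Binomial(n, p): the one-sided tail probability."""
    return sum(binom_pmf(n, k, p) for k in range(t, n + 1))


def posterior_h0(n, t, p0, p1, prior_h0):
    """P(H0 | T = t) via Bayes' theorem, assuming only two simple hypotheses.

    p0 and p1 are the success probabilities under H0 and H1 (assumed),
    and prior_h0 is the assumed prior probability of H0.
    """
    like_h0 = binom_pmf(n, t, p0) * prior_h0
    like_h1 = binom_pmf(n, t, p1) * (1 - prior_h0)
    return like_h0 / (like_h0 + like_h1)


n, t = 20, 15  # assumed data: 15 heads in 20 flips

# The p-value: P(T >= t | H0). It needs no prior and no alternative.
p_value = binom_tail(n, t, 0.5)

# P(H0 | T = t): needs a prior and an alternative, and moves with both.
post_even_prior = posterior_h0(n, t, 0.5, 0.75, prior_h0=0.5)
post_skeptical_prior = posterior_h0(n, t, 0.5, 0.75, prior_h0=0.9)

print(f"p-value P(T >= t | H0)        = {p_value:.4f}")
print(f"P(H0 | T = t), prior P(H0)=0.5 = {post_even_prior:.4f}")
print(f"P(H0 | T = t), prior P(H0)=0.9 = {post_skeptical_prior:.4f}")
```

Under these assumptions the p-value comes out at about 0.021, while P(H0 | T = t) is about 0.068 with an even prior and about 0.40 with a prior of 0.9 on H0. The same data give one p-value but many possible posterior probabilities, which is why reading the p-value as "the probability that the null is true" does not work.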

[This is an extremely common mistake, so it's probably one you've seen elsewhere - and more than once. I've even seen this error committed in statistical textbooks. It's one of several tests I use to reject a text as inadequate; it takes all of a minute or two to check if a text does this kind of thing, but it's generally very telling of the overall standard.]

(If you're a Bayesian, you could argue that a p-value is related to the probability that the null is true given a result at least as extreme (proportional to it via Bayes' theorem), but a Bayesian generally won't calculate a p-value at all, since a Bayesian can just jump straight to the relative posterior probability for the two hypotheses, or more likely would cast the whole thing in terms of decision theory and simply start computing some loss function or other quantity related to some measure of utility or disutility. The people who calculate p-values - frequentists, generally - will deny that it makes sense to talk about the relative probability of the null and alternative {holding that they're either true or false}, and hence the p-value simply has no interpretation even as a quantity proportional to such a probability.)
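The Bayesian relation mentioned in that parenthetical can be written out explicitly. This is only a sketch under assumptions not stated in the comment: a single alternative hypothesis H_1, and prior probabilities P(H_0) and P(H_1) = 1 - P(H_0).

```latex
% Bayes' theorem applied to the event "a result at least as extreme as t":
P(H_0 \mid T \ge t)
  = \frac{P(T \ge t \mid H_0)\, P(H_0)}
         {P(T \ge t \mid H_0)\, P(H_0) + P(T \ge t \mid H_1)\, P(H_1)}

% The "relative posterior probability for the two hypotheses",
% conditioning on the observed value T = t:
\frac{P(H_0 \mid T = t)}{P(H_1 \mid T = t)}
  = \frac{P(T = t \mid H_0)}{P(T = t \mid H_1)} \cdot \frac{P(H_0)}{P(H_1)}
```

The p-value P(T >= t | H_0) appears in the numerator of the first expression, but the factor multiplying it depends on the prior and on how the statistic behaves under H_1, so the p-value alone does not determine the posterior.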

For a more in-depth discussion of the nuances of null hypothesis significance testing (and why it's probably inadequate), you could check out the book

Scientific Method in Practice by Hugh G. Gauch. tl;dr version here: http://web.math.umt.edu/wilson/Math444/Handouts/Cohen94_earth%20is%20round.pdf

The trouble with Bayesian hypothesis testing, of course, is that, while it's conceptually more straightforward and probably more useful than NHST, the computational details are trickier. There also aren't as many good intro textbooks for it.

So, with my modest mathematical abilities, I find myself accepting Bayesian over frequentist inference intellectually, but not handy with the nitty-gritty of carrying it out. Not a good place to be.