Decision Qualities of Bayes Factor and p Value-Based Hypothesis Testing
The purpose of this article is to investigate the decision qualities of the Bayes factor (BF) method compared with p value-based null hypothesis significance testing (NHST). The performance of the 2 methods is assessed in terms of the false- and true-positive rates, as well as the false-discovery rates and the posterior probabilities of the null hypothesis, for 2 different models: an independent-samples t test and an analysis of variance (ANOVA) model with 2 random factors. Our simulation study showed the following: (a) The common BF > 3 criterion is more conservative than the NHST α = .05 criterion, and it corresponds better with the α = .01 criterion. (b) Increasing the sample size affects the false-positive rate and the false-discovery rate differently, depending on whether the BF or the NHST approach is used. (c) When effect sizes are randomly sampled from the prior, power curves tend to be flatter than when effect sizes are prespecified. (d) The larger the scale factor (i.e., the wider the prior), the more conservative the inferential decision is. (e) The false-positive and true-positive rates of the BF method are very sensitive to the scale factor when the effect size is small. (f) Whereas the posterior probability of the null hypothesis ideally follows from the BF value, it can be surprisingly high under NHST. In general, these findings were consistent across the 2 models.
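To make two of the quantities named above concrete, the following is a minimal sketch of the standard textbook relations, not the simulation code used in the study. It assumes that the BF is expressed as evidence for the alternative over the null (written `bf10` here) and that the prior probability of the null hypothesis is a user-supplied value (`prior_null`, set to .5 by default for illustration only). The function and parameter names are hypothetical.

```python
def posterior_null_probability(bf10, prior_null=0.5):
    """Posterior probability of H0 given a Bayes factor BF10 (H1 over H0).

    Assumption: bf10 = p(data | H1) / p(data | H0), and prior_null is the
    prior probability of H0 (0 < prior_null < 1).
    """
    prior_odds_alt = (1.0 - prior_null) / prior_null  # prior odds of H1 vs. H0
    posterior_odds_alt = bf10 * prior_odds_alt        # Bayes' rule in odds form
    return 1.0 / (1.0 + posterior_odds_alt)


def false_discovery_rate(false_positive_rate, true_positive_rate, prior_null=0.5):
    """Expected proportion of true nulls among all positive decisions.

    Assumption: prior_null is the proportion of tested hypotheses for which
    H0 is actually true.
    """
    false_positives = false_positive_rate * prior_null
    true_positives = true_positive_rate * (1.0 - prior_null)
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # With equal prior odds, BF10 = 3 leaves a posterior probability of .25 for H0.
    print(posterior_null_probability(3.0))
    # An alpha of .05 with power .80 and equal prior odds.
    print(false_discovery_rate(0.05, 0.80))
```

Under these assumptions, a decision made exactly at BF = 3 with equal prior odds still leaves a posterior probability of .25 for the null hypothesis, which illustrates why the posterior probability and the false-discovery rate are reported separately from the false-positive rate.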