Bayesian Mixture Modeling of Significant p Values: A Meta-Analytic Method to Estimate the Degree of Contamination From
Publication bias and questionable research practices have long been known to corrupt the published record. One method to assess the extent of this corruption is to examine the meta-analytic collection of significant p values, the so-called p-curve (Simonsohn, Nelson, & Simmons, 2014a). Inspired by statistical research on false-discovery rates, we propose a Bayesian mixture model analysis of the p-curve. Our mixture model assumes that significant p values arise either from the null-hypothesis ℋ0 (when their distribution is uniform) or from the alternative hypothesis ℋ1 (when their distribution is accounted for by a simple parametric model). The mixture model estimates the proportion of significant results that originate from ℋ0, but it also estimates the probability that each specific p value originates from ℋ0. We apply our model to 2 examples. The first concerns the set of 587 significant p values for all t tests published in the 2007 volumes of Psychonomic Bulletin & Review and the Journal of Experimental Psychology: Learning, Memory, and Cognition; the mixture model reveals that p values higher than about .005 are more likely to stem from ℋ0 than from ℋ1. The second example concerns 159 significant p values from studies on social priming and 130 from yoked control studies. The results from the yoked controls confirm the findings from the first example, whereas the results from the social priming studies are difficult to interpret because they are sensitive to the prior specification. To maximize accessibility, we provide a web application that allows researchers to apply the mixture model to any set of significant p values.