Journal article Open Access

What p -hacking really looks like: A comment on Masicampo and LaLande (2012)

Lakens, Daniël

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Other (Open)</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2014-12-06</subfield>
  </datafield>
  <controlfield tag="005">20200120172911.0</controlfield>
  <controlfield tag="001">235811</controlfield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Masicampo and Lalande (2012; M&amp;L) assessed the distribution of 3627 exactly calculated p-values between .01 and .10 from 12 issues of three journals. The authors concluded that "The number of p-values in the psychology literature that barely meet the criterion for statistical significance (i.e., that fall just below .05) is unusually large" and that "Specifically, the number of p-values between .045 and .050 was higher than that predicted based on the overall distribution of p."
There are four factors that determine the distribution of p-values, namely the number of studies examining true effects and false effects, the power of the studies that examine true effects, the Type 1 error rate (and the extent to which it was inflated), and publication bias. Due to publication bias, we should expect a substantial drop in the frequency with which p-values above .05 appear in the literature. True effects yield a right-skewed p-curve (the higher the power, the steeper the curve, e.g., Sellke, Bayarri, &amp; Berger, 2001). When the null-hypothesis is true, the p-curve is uniformly distributed, but when the Type 1 error rate is inflated due to flexibility in the data-analysis, the p-curve can become left-skewed for p-values below .05.
M&amp;L (and others, e.g., Leggett, Thomas, Loetscher, &amp; Nicholls, 2013) model p-values with a single exponential curve estimation procedure that provides the best fit of p-values between .01 and .10 (see Figure 3, right pane). This is not a valid approach, because p-values above and below p = .05 do not lie on a continuous curve due to publication bias. It is therefore neither surprising nor indicative of a prevalence of p-values just below .05 that their single curve does not fit the data very well, nor that chi-squared tests show the residuals (especially those just below .05) are not randomly distributed.</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">185894</subfield>
    <subfield code="z">md5:40aeefe4dde04fb47d04d3e8c21552f3</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="0">(orcid)0000-0002-0247-239X</subfield>
    <subfield code="a">Lakens, Daniël</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1080/17470218.2014.982664</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">What p -hacking really looks like: A comment on Masicampo and LaLande (2012)</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
</record>
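The abstract's central claim — that p-values are uniformly distributed when the null hypothesis is true and right-skewed when a true effect is studied — can be illustrated with a minimal simulation. The sketch below is not from the article; the sample size (n = 30) and effect size (d = 0.5) are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n = 10_000, 30

def p_values(effect_size):
    # Two-sided one-sample t-tests on normal samples with the given true mean
    # (sd = 1, so effect_size is Cohen's d).
    samples = rng.normal(loc=effect_size, scale=1.0, size=(n_sims, n))
    t = samples.mean(axis=1) / (samples.std(axis=1, ddof=1) / np.sqrt(n))
    return 2 * stats.t.sf(np.abs(t), df=n - 1)

p_null = p_values(0.0)   # H0 true: p-values are roughly uniform on (0, 1)
p_true = p_values(0.5)   # true effect: right-skewed p-curve

# Under H0, about 5% of p-values land below .05; under a true effect,
# small p-values are heavily over-represented.
frac_null = np.mean(p_null < 0.05)
frac_true = np.mean(p_true < 0.05)
print(frac_null, frac_true)
```

With higher power (larger n or d), the right skew becomes steeper, which is the relationship the abstract attributes to Sellke, Bayarri, and Berger (2001).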