P-hacking
Manipulation of data analysis to achieve statistically significant results, often by repeatedly testing different variables or subsets of data until desirable outcomes are found.
P-hacking, or data dredging, occurs when researchers conduct multiple statistical tests on data and selectively report those that yield significant p-values, typically below the conventional 0.05 threshold. This practice undermines the integrity of scientific research by inflating the likelihood of Type I errors—false positives—where researchers incorrectly reject the null hypothesis. P-hacking exploits the flexibility in data analysis, such as choosing different covariates, altering sample sizes, or reclassifying data to achieve statistical significance. While it can arise from intentional misconduct, it often results from subconscious biases and pressures to publish significant findings. The cumulative effect erodes the reproducibility and reliability of scientific studies, leading to misleading conclusions and wasted resources.
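The inflation is straightforward to quantify: with k independent tests at a 0.05 threshold, the probability of at least one false positive is 1 - (1 - 0.05)^k, which already exceeds 60% at k = 20. The following Python sketch is purely illustrative (the sample size, number of predictors, and variable names are arbitrary choices for the demonstration); it simulates a researcher who tests many unrelated variables against an outcome and reports only the best p-value:

```python
# Illustrative simulation of p-hacking: test many pure-noise predictors
# against an unrelated outcome and report only the smallest p-value.
# All parameters below are hypothetical choices for the demonstration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects = 50      # observations per simulated study
n_predictors = 20    # candidate variables tested in each study
n_studies = 2000     # number of simulated studies
alpha = 0.05

false_positive_studies = 0
for _ in range(n_studies):
    outcome = rng.normal(size=n_subjects)                      # outcome is pure noise
    predictors = rng.normal(size=(n_subjects, n_predictors))   # predictors are pure noise
    # "P-hack": run one correlation test per predictor, keep the best p-value.
    p_values = [stats.pearsonr(predictors[:, j], outcome)[1]
                for j in range(n_predictors)]
    if min(p_values) < alpha:
        false_positive_studies += 1

print(f"Share of studies reporting a 'significant' finding: "
      f"{false_positive_studies / n_studies:.2f}")  # roughly 0.64, not 0.05
```

Even though every predictor in this sketch is unrelated to the outcome, the simulated studies report a "significant" result roughly two-thirds of the time rather than the nominal 5%.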
The term "p-hacking" gained prominence in the early 2010s, though the practice itself has been a concern since the advent of statistical hypothesis testing in the mid-20th century. John Ioannidis's 2005 paper, "Why Most Published Research Findings Are False," highlighted the prevalence of biased research practices, bringing widespread attention to issues like p-hacking.
Key contributors to the discussion and awareness of p-hacking include John Ioannidis, whose work in the mid-2000s raised critical concerns about research reliability. The term itself is generally credited to Joseph Simmons, Leif Nelson, and Uri Simonsohn, whose 2011 paper "False-Positive Psychology" showed how undisclosed flexibility in data collection and analysis allows researchers to present almost any result as significant. Andrew Gelman and Eric Loken's 2013 paper "The Garden of Forking Paths" extended the argument, showing that the many choices available in data analysis can produce spurious findings even without deliberate fishing. Their work, along with contributions from the field of meta-research, has been instrumental in advocating for more rigorous statistical practices and transparency in research.