Hard to credit though it is, researchers keep hacking data until they get the number they need
Adrian Barnett (QUT) and Jonathan Wren (Oklahoma Medical Research Foundation) examined over 1.3m confidence intervals from research paper abstracts and full texts and found “an excess of statistically significant results.”
What they did: They used text-mining algorithms to find patterns in the reporting of confidence intervals, counting interval limits just above and below the value of 1, which for ratio measures (such as odds ratios) is the threshold for statistical significance at p = 0.05.
What they found: “There was a steep increase in the number of lower interval limits just above 1, meaning they were just above the threshold for statistical significance. Similarly, there was a steep increase in the number of upper interval limits just below 1 and so just inside the statistically significant threshold.”
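To make the finding concrete: a 95% confidence interval for a ratio is significant exactly when it excludes 1, so a lower limit sitting just above 1 marks a result that only just cleared the bar. The sketch below (not from the paper; the function name and inputs are illustrative) computes such an interval on the log scale, the standard approach for ratio estimates.

```python
import math

def ratio_ci(log_ratio, se, z=1.96):
    """95% confidence interval for a ratio estimate (e.g. an odds ratio),
    computed on the log scale and exponentiated back."""
    lo = math.exp(log_ratio - z * se)
    hi = math.exp(log_ratio + z * se)
    return lo, hi

# Illustrative numbers: a lower limit just above 1, i.e. "just significant".
lo, hi = ratio_ci(log_ratio=0.5, se=0.25)
print(round(lo, 2), round(hi, 2))  # lower limit barely excludes 1
```

Barnett and Wren's signal is a pile-up of exactly these borderline limits: far more lower limits just above 1 (and upper limits just below 1) than chance would produce.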
What it means: p-hacking, that is what. Professor Barnett tells CMM, “The basic idea of p-hacking is that researchers repeat their analysis until they get a result with p under 0.05. Then they present this result as if it was the only analysis they did. This is scientifically dishonest and creates a huge bias in the evidence base. There are multiple villains. Researchers feel pressure to do this in order to keep their jobs. The journals and peer reviewers prefer to publish ‘interesting’ results.”
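Why repeating analyses is so damaging can be shown with a simple simulation (my own sketch, not from the study): under a true null hypothesis an honest single test is falsely significant about 5% of the time, but a researcher who keeps trying until one of several analyses gives p < 0.05 will report a “finding” far more often.

```python
import math
import random

random.seed(1)

def p_value(n=30):
    """Two-sided p-value for the mean of n draws from a true-null N(0, 1),
    using a z-test with known variance."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(xs) / n) / (1 / math.sqrt(n))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def hacked(k):
    """Run up to k independent analyses and report success at the first
    p < 0.05 -- the p-hacking strategy Barnett describes."""
    return any(p_value() < 0.05 for _ in range(k))

trials = 2000
honest = sum(p_value() < 0.05 for _ in range(trials)) / trials
phacked = sum(hacked(5) for _ in range(trials)) / trials
print(f"honest false-positive rate: {honest:.2f}")   # near the nominal 0.05
print(f"rate after 5 tries:         {phacked:.2f}")  # well above 0.05
```

With five tries the expected false-positive rate is roughly 1 - 0.95^5, about 23%, even though nothing real is being measured.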
What should be next: “Statistical significance should no longer be used as a tool to screen what results are published and the evidence base would be in a better state if significance were given far less prominence,” Barnett and Wren write.
“We need a massive change in attitudes and practice in order to change and provide a more honest and useful picture of what drugs, procedures, interventions, etc. really work,” Professor Barnett says.