Lies, Damn Lies, and Stuff Missing From the Pre-Analysis Plan

There’s an old joke about a government project to find the answer to that great mathematical conundrum, 2 + 2 = ? When the economist gets his turn to answer, he whispers in the ear of the bureaucrat overseeing the project, “What do you want it to equal?” Social scientists have an unsavory reputation for massaging data, but now some of them are using a new mechanism to keep their analyses on the straight and narrow.

It’s not hard to see why some social scientists might stray from the path of rigor. With funding for research under threat and university salaries falling in real terms for several years, there is intense pressure to garner grants and publish papers. The problem is that academic journals tend to prefer studies whose results are statistically significant, except when an insignificant result would confirm conventional wisdom. So professors who want to see their work in print may be tempted to keep churning through the numbers until they find relationships that confirm their intuition, even when the relationships are essentially spurious.
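
To see why churning through the numbers is so dangerous, consider a rough sketch of the arithmetic. It assumes every relationship being tested is truly absent, that the tests are independent, and that "significant" means a p-value below 0.05; the function names here are illustrative, not taken from any study mentioned in this post. Under those assumptions, a researcher who tries 20 different tests has roughly a 64 percent chance of finding at least one "significant" result by luck alone.

```python
import random


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance that at least one of n_tests independent tests of a true
    null hypothesis comes out 'significant' at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_fishing(n_tests, n_studies=20000, alpha=0.05, seed=0):
    """Simulate many studies in which no real effect exists.

    Under a true null hypothesis a p-value is uniformly distributed on
    [0, 1], so each test is modeled as one uniform random draw. A study
    counts as a 'finding' if any of its tests falls below alpha.
    """
    rng = random.Random(seed)
    findings = 0
    for _ in range(n_studies):
        if any(rng.random() < alpha for _ in range(n_tests)):
            findings += 1
    return findings / n_studies


if __name__ == "__main__":
    # Compare the formula with the simulation for a few test counts.
    for k in (1, 5, 20):
        print(k, round(familywise_error_rate(k), 3), round(simulate_fishing(k), 3))
```

Real analyses are rarely this tidy, since the tests are usually correlated with one another, but the direction of the problem is the same: the more specifications a researcher tries, the more likely it is that something spurious clears the bar.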

One way to resist this temptation is to commit to using a predetermined battery of statistical tests before the data are even collected. By telling the world how they intend to use their data, social scientists can avoid accusations of “mining” or “fishing” from their colleagues and critics. This declaration can take the form of a public pre-analysis plan or the registration of hypotheses with an independent archive.

The presence of a pre-analysis plan can strongly affect the perception and production of statistical results. As Donald Green points out, people are much more likely to accept a result as significant when their prior beliefs don’t include a suspicion of fishing by the researcher. And in a recent paper on institutional reform in Sierra Leone, Katherine Casey, Rachel Glennerster, and Edward Miguel show how failing to use a pre-analysis plan might have led them to two different – and wrong – sets of conclusions.

Of course, there are downsides to pre-analysis plans. As researchers get to know their data, they often learn why their initial hypotheses are unlikely to be true and form new hypotheses based on a deeper understanding of underlying processes. A pre-analysis plan would prohibit them from testing these new hypotheses… unless they abandoned the research and started over with a new pre-analysis plan.

Clearly, integrity and sensitivity to potential biases still have a role to play. But to the extent that pre-analysis plans and hypothesis registries promote transparency in the social sciences – and indeed in any of the sciences – they will increase the public’s trust in research, and thus the power of research to improve our lives.
