
The null result penalty

There are growing concerns about publication bias in academic studies, particularly against papers with small effects that are not statistically significant. Using a large-scale survey of academic economists, this column finds a substantial perceived penalty against null results. Respondents believe studies with a null result have a lower chance of being published and perceive those studies as having lower quality. Further analysis suggests the communication of statistical uncertainty and perceptions of statistical precision are important factors affecting the null result penalty, but finds no evidence of a bias towards surprising findings in the publication process.

Scientists test hypotheses with empirical evidence (Popper 1934). This evidence accumulates with the publication of studies in scientific journals. The expansion of scientific knowledge thus requires a publication system that evaluates studies without systematic bias. Yet, there are growing concerns about publication bias in scientific studies (Brodeur et al. 2016, Simonsohn et al. 2014). Such publication bias could arise from the publication system punishing research papers with small effects that are not statistically significant. The resulting selection could lead to biased estimates and misleading confidence intervals in published studies (Andrews and Kasy 2019).

Large-scale surveys with academic economists

In a new paper (Chopra et al. 2022), we examine whether there is a penalty in the publication system for research studies with null results and, if so, what mechanisms lie behind it. To address these questions, we conduct experiments with about 500 economists from the world's top 200 economics departments.

The researchers in our sample have rich experience as both producers and evaluators of academic research. For example, 12.7% of our respondents are associate editors of scientific journals, and the median researcher has an H-index of 11.5 and 845 citations on Google Scholar. This allows us to study how experienced researchers in the field of economics evaluate research studies.

In the experiment itself, these researchers were shown descriptions of four hypothetical research studies. Each description was based on an actual research study by economists, but we modified some details for the purposes of our experiment. Each description included the research question, the experimental design (including the sample size and the control group mean), and the study's main finding.

Our main intervention varies the statistical significance of the main finding of a research study, holding all other features of the study constant. We randomised whether the point estimate associated with the main finding of the study is large (and statistically significant) or close to zero (and thus not statistically significant). Importantly, in both cases, we keep the standard error of the point estimate identical, which allows us to hold the statistical precision of the estimate constant.
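As a minimal sketch of this manipulation (the numbers below are made up for illustration and are not the vignette values from the paper), fixing the standard error while varying the point estimate flips statistical significance without changing the precision of the estimate:

```python
import scipy.stats as st

# Illustrative numbers only (not the vignette values from the paper):
# both treatment arms share the same standard error, so the statistical
# precision of the estimate is identical; only the point estimate differs.
SE = 0.05
ARMS = {"large effect": 0.15, "null result": 0.01}

for label, estimate in ARMS.items():
    z = estimate / SE                  # test statistic
    p = 2 * st.norm.sf(abs(z))         # two-sided p-value
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: estimate = {estimate}, SE = {SE}, "
          f"z = {z:.2f}, p = {p:.4f} -> {verdict}")
```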

How does the statistical significance of a research study’s main finding affect researchers’ perceptions and evaluations of the study? To find out, we asked our respondents how likely they think it is that the study would be published in a specific journal if it were submitted there. The journal was either a general interest journal (such as the Review of Economic Studies) or an appropriate top field journal (such as the Journal of Economic Growth). In addition, we measured their perceptions of the quality and importance of the study.

Is there a null result penalty?

We find evidence for a substantial perceived penalty against null results. The researchers in our sample think that research studies with null results have a 14.1 percentage points lower chance of being published (Panel A of Figure 1). This corresponds to a 24.9% decrease relative to the scenario in which the same study yields a statistically significant finding.
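The two reported figures are mutually consistent and pin down the implied baseline: a 14.1 percentage point drop that amounts to a 24.9% relative decrease implies a publication probability of roughly 56.6% for the statistically significant version of the study, falling to about 42.5% for the null result version:

```latex
\[
\frac{14.1\ \text{pp}}{0.249} \approx 56.6\%,
\qquad
56.6\% - 14.1\ \text{pp} \approx 42.5\%.
\]
```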

In addition, researchers hold more negative views about studies that yielded a null result (Panel B of Figure 1). The researchers in our experiment perceive those studies to have 37.3% of a standard deviation lower quality. Studies with null results are also rated by our respondents to have 32.5% of a standard deviation lower importance.

Does experience moderate the null result penalty? We find that the penalty is of comparable magnitude across groups of researchers, from PhD students to editors of scientific journals. This suggests that the null result penalty cannot be attributed to insufficient experience with the publication process itself.

Figure 1 The null result penalty


Mechanisms

Why do researchers perceive studies with findings that are not statistically significant to be discounted in the publication process? Additional features of our design allow us to examine three potential factors.

Communication of uncertainty

Could the way in which we communicate statistical uncertainty affect the size of the null result penalty? In our experiment, we cross-randomised whether researchers were shown the standard error of the main finding or the p-value associated with a test of whether the main finding is statistically significant. This treatment variation is motivated by a longstanding concern in the academic community that the emphasis on p-values and tests of statistical significance could contribute to biases in the publication process (Camerer et al. 2016, Wasserstein and Lazar 2016). We find that the null result penalty is 3.7 percentage points larger when the main results are reported with p-values, demonstrating that the way in which we communicate statistical uncertainty matters in practice.
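To illustrate the two reporting formats (again with hypothetical numbers rather than the paper's vignette values), the same null-result finding can be communicated either way:

```python
import scipy.stats as st

# Hypothetical vignette numbers (not from the paper): one null-result
# finding, communicated in the two cross-randomised formats.
estimate, se = 0.01, 0.05
p_value = 2 * st.norm.sf(abs(estimate / se))   # two-sided p-value

# Format 1: uncertainty reported as a standard error
print(f"Treatment effect: {estimate:.2f} (standard error: {se:.2f})")

# Format 2: uncertainty reported as a p-value from a significance test
print(f"Treatment effect: {estimate:.2f} (p-value: {p_value:.2f})")
```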

Preference for surprising results

Our respondents might think that the publication process values findings that are surprising relative to the prior in the literature. Indeed, Frankel and Kasy (2022) show that publishing surprising results is optimal if journals aim to maximise the policy impact of published studies. Such a mechanism could potentially explain the null result penalty if researchers perceive a larger penalty for null results that are unsurprising to experts in the field. To examine this, we randomly provided some of our respondents with an expert forecast of the treatment effect, randomising whether the experts predicted a large effect or an effect close to zero. We find that the null result penalty is unchanged when respondents are told that experts in the literature predicted a null result. However, when experts predict a large effect, so that a null finding is surprising, the penalty increases by 6.3 percentage points. These patterns suggest that the penalty against null results cannot be explained by researchers believing that the publication process favours surprising results: if it were, they should have evaluated surprising null results more positively, not more negatively.

Perceived statistical precision

Finally, we investigate the hypothesis that null results might be perceived as more noisily estimated – even when the objective precision of the estimate is held constant. To test this hypothesis, we conducted an experiment with a sample of PhD students and early career researchers. The design and the main outcome of this experiment are identical to our main experiment, but we replaced the questions about quality and importance with a question about the perceived precision of the main finding. We also find a sizeable null result penalty in this more junior sample of researchers. In addition, null results are perceived to have 126.7% of a standard deviation lower precision, even though we fixed respondents’ beliefs about the standard error of the main finding (Panel B of Figure 1). This suggests that researchers might employ simple heuristics to gauge the statistical precision of findings.
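One way to formalise such a heuristic – our illustration, not a mechanism documented in the paper – is that respondents read precision off the size of the test statistic rather than off the standard error, so two equally precise estimates look differently noisy:

```python
# Illustrative numbers: identical standard errors mean identical objective
# precision (the 95% confidence intervals have the same width), yet a
# heuristic that infers precision from the t-ratio would call the null
# result "noisier".
SE = 0.05
for label, estimate in {"large effect": 0.15, "null result": 0.01}.items():
    ci_width = 2 * 1.96 * SE        # objective precision: identical across arms
    t_ratio = abs(estimate) / SE    # heuristic cue: differs sharply
    print(f"{label}: 95% CI width = {ci_width:.2f}, |t| = {t_ratio:.1f}")
```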

Broader implications

Our findings have important implications for the publication system. First, our study highlights the potential value of pre-results review, in which research papers are evaluated before the empirical results are known (Miguel 2021). Second, our results suggest that referees could be given additional guidelines for evaluating research that emphasise the informativeness and importance of null results (Abadie 2020). Our study also has implications for the communication of research findings. In particular, our results suggest that communicating the statistical uncertainty of estimates in terms of standard errors rather than p-values might alleviate the penalty for null results. Our findings contribute to a broader debate on the challenges facing the current publication system (Angus et al. 2021, Andre and Falk 2021, Card and DellaVigna 2013, Heckman and Moktan 2018) and potential ways to improve the publication process in economics (Charness et al. 2022).

References

Abadie, A (2020), “Statistical nonsignificance in empirical economics”, American Economic Review: Insights 2(2): 193–208.

Andre, P and A Falk (2021), “What’s worth knowing in economics? A global survey among economists”, VoxEU.org, 7 September.

Andrews, I and M Kasy (2019), “Identification of and correction for publication bias”, American Economic Review 109(8): 2766–2794.

Angus, S, K Atalay, J Newton and D Ubilava (2021), “Editorial boards of leading economics journals show high institutional concentration and modest geographic diversity”, VoxEU.org, 31 July.

Brodeur, A, M Lé, M Sangnier and Y Zylberberg (2016), “Star wars: The empirics strike back”, American Economic Journal: Applied Economics 8(1): 1–32.

Camerer, C F, A Dreber, E Forsell, T-H Ho, J Huber, M Johannesson, M Kirchler, J Almenberg, A Altmejd, T Chan, E Heikensten, F Holzmeister, T Imai, S Isaksson, G Nave, T Pfeiffer, M Razen and H Wu (2016), “Evaluating replicability of laboratory experiments in economics”, Science 351(6280): 1433–1436.

Card, D and S DellaVigna (2013), “Nine facts about top journals in economics”, VoxEU.org, 21 January.

Charness, G, A Dreber, D Evans, A Gill and S Toussaert (2022), “Economists want to see changes to their peer review system. Let’s do something about it”, VoxEU.org, 24 April. 

Chopra, F, I Haaland, C Roth and A Stegmann (2022), “The null result penalty”, CEPR Discussion Paper 17331.

Frankel, A and M Kasy (2022), “Which findings should be published?”, American Economic Journal: Microeconomics 14(1): 1–38.

Heckman, J and S Moktan (2018), “Publishing and promotion in economics: The tyranny of the Top Five”, VoxEU.org, 1 November.

Miguel, E (2021), “Evidence on research transparency in economics”, Journal of Economic Perspectives 35(3): 193–214.

Popper, K (1934), The logic of scientific discovery, Routledge.

Simonsohn, U, L D Nelson and J P Simmons (2014), “p-curve and effect size: Correcting for publication bias using only significant results”, Perspectives on Psychological Science 9(6): 666–681.

Wasserstein, R L and N A Lazar (2016), “The ASA Statement on p-Values: Context, Process, and Purpose”, The American Statistician 70(2): 129–133.
