Economists want to see changes to their peer review system. Let’s do something about it.

Gary Charness, Anna Dreber, Daniel Evans, Adam Gill, Severine Toussaert 24 April 2022

Economics is a dynamic field, but its approach to peer review has remained relatively static. Moreover, while recent efforts at understanding the publication process in economics have been fruitful (Galiani and Panizza 2020), there is still much we do not know about peer review in particular. This omission is not immaterial, given the central role peer review plays in our discipline. For individual researchers, top five publications are a key input into hiring and tenure decisions (Heckman and Moktan 2020). More broadly, peer review can systematically influence the research direction of the field. For example, the use of pre-publication peer review might discourage the pursuit of daring research in favour of safer topics, relative to post-publication peer review (Gross and Bergstrom 2021). This finding has particular salience in light of suggestions that economics should re-evaluate its norms around the production of research (Akerlof 2020; Andre and Falk 2021). 

To investigate the current state of our system, we surveyed over 1,400 economists between July 2020 and January 2021 about their experiences with and attitudes towards peer review. Our respondents were fairly representative of the underlying population of academic economists (but deviated most notably on the dimensions of field and location, with more experimental/behavioural economists and researchers based in Europe). We asked about their recent submission and peer review activity, their perspectives as referees and authors, and their opinions about proposals for reform. In this column, we use our survey data to evaluate the peer review system in economics and consider a range of proposals potentially responsive to the issues identified. Our findings are presented in greater detail in a recent report (Charness et al. 2022) and on our website (https://evalresearch.weebly.com/). 

The ecosystem of peer review

We start by documenting the peer review activity of our respondents. In the two years prior to the survey, our respondents made a mean (median) of 3.5 (3) journal submissions per year. A rightward skew is visible in this distribution, with approximately 20% of authors making 50% of submissions (Figure 1). As referees, our respondents wrote a mean (median) of 10.2 (8) reports per year. This distribution is similarly skewed, with around 15% of referees responsible for 40% of all reports written (Figure 2).

Figure 1 Distribution of annual number of submissions   

Note: Figure shows the distribution of the annual number of submissions made by respondents, averaged over two years. Each submission of the same paper to a different journal counts as a separate submission. N = 1,484.

Figure 2 Distribution of annual number of reports written

Note: Figure shows the distribution of the annual number of reports written by our respondents. N = 1,483.

The most active referees tend to be more senior and located in the US/Canada, and they often have editorial experience. There is heterogeneity in which journals respondents referee for: almost half of them write no reports for a top five journal, while the top 25% of respondents write approximately 80% of top five reports (Figure 3). We estimate that the average (median) respondent spends 12 (9) working days per year on refereeing. The top 10% of the distribution dedicates 25 working days or more, which is quite substantial considering refereeing is usually unpaid.

We also asked our respondents some general questions about the aims of peer review. As authors, what they most expect from peer review is “a reasonable and well-substantiated decision”, with lesser weight given to timeliness and receiving feedback. Meanwhile, reports should primarily “help the editor reach an informed decision”, with limited emphasis placed on giving comments.

Figure 3 Lorenz curves for reports written, separated by journal type

Notes: Figure shows the Lorenz curves of reports written by respondents, separated by journal type. On the x-axis, respondent percentiles are based on the volume of reports they write. The y-axis gives the cumulative proportion of reports written by respondents at or below each percentile. For instance, respondents above the 75th percentile are responsible for 80% of reports written for top five journals (blue line). The Gini coefficients, which measure the extent of inequality in the distribution, are stated in parentheses. N = 1,483.
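The Lorenz curves and Gini coefficients in Figure 3 can be computed directly from per-respondent report counts. A minimal Python sketch, using synthetic, hypothetical counts rather than the survey data (the function name and parameters are our own):

```python
import numpy as np

def lorenz_gini(counts):
    """Return Lorenz-curve points and the Gini coefficient for a
    distribution of per-person report counts."""
    x = np.sort(np.asarray(counts, dtype=float))
    cum = np.cumsum(x) / x.sum()            # cumulative share of reports
    lorenz = np.insert(cum, 0, 0.0)         # start the curve at (0, 0)
    # Gini = 1 - 2 * area under the Lorenz curve (trapezoidal rule)
    dp = 1.0 / (len(lorenz) - 1)
    area = dp * np.sum((lorenz[1:] + lorenz[:-1]) / 2.0)
    return lorenz, 1.0 - 2.0 * area

# Hypothetical right-skewed report counts (illustrative only)
rng = np.random.default_rng(0)
counts = rng.lognormal(mean=2.0, sigma=0.8, size=1000)
_, g = lorenz_gini(counts)
print(round(g, 2))  # a single inequality measure in [0, 1]
```

A perfectly equal distribution yields a Gini of 0, while heavy concentration of refereeing among a few respondents pushes it towards 1, which is how the skew in Figures 1–3 is summarised.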

Allocating papers to journals and referees

In this section, we evaluate the process of allocating manuscripts to referees and journals. In particular, we ask whether our system performs satisfactorily in terms of quantity (distribution of workload) and quality (matching on relevant topics, skills, and an absence of conflicts of interest).

The first step in the allocation process is journal submission. Since researchers have strong incentives to publish in a top five journal (Heckman and Moktan 2020), some may ‘shop around’ among elite journals until one (hopefully) accepts their paper. Consistent with this, submission volumes at the American Economic Review (AER) and Econometrica grew by approximately 37.5% and 61.3%, respectively, between 2010 and 2020 (Charness et al. 2022).

After submission, editors request reports from some researchers disproportionately often. Tenured professors tend to write more reports than they think is reasonable, while PhD candidates and postdocs write fewer. In fact, only about 25% of respondents write the number of reports they think is reasonable (Figure 4). These findings suggest that some refereeing work could be reallocated to early-career researchers who would benefit from additional exposure to the system. In addition, there is evidence that prestigious refereeing and editorial opportunities are highly concentrated among researchers at elite universities, especially those in North America (Angus et al. 2021).

We also compare researchers’ report-writing volume with their submission volume. The average respondent writes 2.8 reports per submission (Figure 5), with about 40% of respondents writing three or more reports per submission, and around 15% writing fewer reports than they submit papers.

Figure 4 Actual vs reasonable number of referee reports written      

Notes: Figure compares the number of reports respondents write annually (y-axis) with the number they consider reasonable for themselves to write (x-axis). Four outliers were removed for clarity (N = 1,474). The blue line shows the regression of the actual on the reasonable number of reports written (slope coefficient of 1.16; intercept of -0.26; N = 1,478, outliers included).
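The fitted line described in the note to Figure 4 is a simple OLS regression of actual on reasonable report counts. The sketch below reproduces the idea on synthetic, hypothetical data; the data-generating coefficients are seeded to mimic those reported, not estimated from the survey:

```python
import numpy as np

# Hypothetical data: reasonable (x) vs actual (y) reports per year.
# These numbers are illustrative, not the survey responses.
rng = np.random.default_rng(1)
reasonable = rng.integers(1, 20, size=500).astype(float)
actual = -0.26 + 1.16 * reasonable + rng.normal(0.0, 2.0, size=500)

# OLS fit of actual on reasonable: a slope above 1 means the gap
# between actual and reasonable widens as the reasonable number grows.
slope, intercept = np.polyfit(reasonable, actual, 1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

With a slope above 1, respondents who already consider a high number of reports reasonable tend to exceed even that number, consistent with the concentration of refereeing work among a minority.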

Figure 5 Ratio of reports to submissions by characteristics 

Note: Figure shows the ratios of annual reports written to annual submissions for different subgroups, with 57 outliers removed for clarity (N = 1,402, without outliers).

Besides concerns around the distribution of the workload, there are also questions about the quality of allocations. Approximately 60% of respondents rejected more than one in ten referee requests, and 30% rejected one in four or more; tellingly, around half of those who considered rejecting a request said the paper was “too remote from [their] research field”. One in six also mentioned a conflict of interest as a reason to consider declining. Opinions differ on how to handle conflicts involving referees: around 40% argue that referees should never review manuscripts from friends and co-authors, while another 40% say such reviews should happen “as little as possible but cannot be avoided sometimes.”

Among the proposals we considered to address these issues, we note widespread support for the creation of a centralized platform that would collect information about reviewers’ current workloads, research interests, and networks. Editors could use the platform (perhaps a simple website) to expand referee pools, improve match quality, and avoid conflicts of interest.

Content of the reports

In its guidelines, Econometrica asks referees to be professional, give precise suggestions, and avoid “gratuitous or irrelevant criticisms.” Do reports tend to live up to this standard? Our respondents generally perceive the reports they receive to be of heterogeneous quality (Figure 6). Most often, low-quality reports are characterised by “inaccurate statements about what the paper does”. By contrast, reports that our respondents found useful offer comments that “clarify the contribution of the paper” and “improve the existing analysis”. Suggestions for robustness checks were appreciated less.

Our respondents thought that clearer guidelines and doctoral training would improve report quality. Women and early-career researchers seem particularly supportive of doctoral training (Figure 7). In general, respondents also support feedback mechanisms from editors to referees, such as sharing decisions and systems for editors to grade referees. Most respondents supported a formal process for authors to appeal decisions they disagree with.

Figure 6 Perceived quality of reports received      

Note: Figure shows the mean percentage of reports respondents received at each perceived level of quality. N = 1,459.                                           

Figure 7 Views on doctoral training 

Note: Figure shows the distribution of opinions about the usefulness of doctoral training for improving referee reports, split by respondent subgroup.  N = 1,459.

Reviewing process and decision times

Recent work has demonstrated that the peer review process in economics is unusually slow, both in comparison to other disciplines and to its own past performance (Hadavand et al. 2021; Huisman and Smits 2017). Increasing pressure from growing submission volumes and other demands seems likely to exacerbate this issue. In this section, we investigate delays at one specific stage of the process: the time referees take to return their reports.

Two-thirds of respondents report being late at least once in the previous two years, with a median delay of 1–2 weeks (Figure 8). Interestingly, respondents with more delays do not reject requests at a higher rate, even when they write more reports than they deem reasonable. Similar shares of respondents said that four, six, and eight weeks are appropriate lengths of time to return a report. Notably, many respondents are late even by their own ideal deadlines.

When it comes to reducing delays, most respondents support the practice of desk-rejecting obviously unsuitable manuscripts. Three-quarters of respondents believe referees would perform better if they were rewarded more, including through non-monetary rewards. Finally, respondents overwhelmingly supported the short-paper, accept-or-reject, one-revision-round format of AER: Insights (Figure 9).

Figure 8 Percentage of time respondents had a delay    

Notes: Figure shows the distribution of how often respondents were late in turning in their peer review reports. N = 1,483.                                            

Figure 9 Support for the AER: Insights model

Note: Figure shows the distribution of opinions towards more journals adopting submission and review policies similar to those of AER: Insights, split by respondent subgroup. N = 1,459.

Innovations in peer review

In this section, we explore attitudes towards potential innovations in peer review. Respondents were quite sceptical about a policy where “referees sign their reports and the entire review history […] is disclosed” (Figure 10). They were more open to making reports public in an anonymized way, with junior respondents being the most favourable (Figure 11). However, respondents appeared uncertain about the benefits of this policy, suggesting journals might want to experiment further in this area.

Figure 10 Attitudes toward open peer review (for all referees)

Notes: Figure shows the distribution of opinions towards an open peer review system that applies to all referees, split by respondent subgroup. Here, “open peer review” refers to referee reports and referee identities being published alongside the manuscript at the conclusion of the peer review process. N = 1,459.

Figure 11 Support for disclosing the review history

Note: Figure shows the distribution of attitudes towards disclosing the entire review history of a paper in an anonymized way, again split by respondent subgroup. N = 1,459.

Journals could also experiment with proposals that are more transformative in nature. For example, we could reconsider who should conduct peer review: should reports be solicited from ‘outsiders’ to economics (e.g. policymakers or researchers from other disciplines)? We could also ask when peer review should happen: should we expand the use of pre- and post-submission feedback and evaluation? While we could not survey our respondents on these topics, we hope to see more discussions about them in the near future. 

Conclusion

In this column, we make extensive use of the survey data collected for our report. We acknowledge that our data and analysis have several limitations, including the representativeness of our sample and possible measurement error. Even so, we hope that we have stirred your interest in reforming peer review. A more complete treatment of these (and other) issues can be found in our report, and we invite you to join the dialogue on our website’s discussion forum (https://evalresearch.weebly.com/discussion.html).

References

Akerlof, G (2020), “Sins of Omission and the Practice of Economics,” Journal of Economic Literature 58(2): 405-418.

Andre, P and A Falk (2021), “What’s worth knowing in economics? A global survey among economists,” VoxEU.org, 7 September.

Angus, S, K Atalay, J Newton, and D Ubilava (2021), “Editorial boards of leading economics journals show high institutional concentration and modest geographic diversity,” VoxEU.org, 31 July.

Charness, G, A Dreber, D Evans, A Gill, and S Toussaert (2022), Improving Peer Review in Economics: Stocktaking and Proposals, Technical Report.

Galiani, S and U Panizza (2020), Publishing and Measuring Success in Economics, CEPR Press.

Gross, K and C Bergstrom (2021), “Why ex post peer review encourages high-risk research while ex ante review discourages it”, Proceedings of the National Academy of Sciences 118(51).

Hadavand, A, D Hamermesh, and W Wilson (2021), “Publishing Economics: How Slow? Why Slow? Is Slow Productive? Fixing Slow?”, NBER Working Paper 29147.

Heckman, J and S Moktan (2020), “Publishing and promotion in economics: The tyranny of the Top Five,” Journal of Economic Literature 58(2): 419-470.

Huisman, J and J Smits (2017), “Duration and quality of the peer review process: the author’s perspective,” Scientometrics 113: 633-650.


Professor of Economics and Director of the Experimental and Behavioral Economics Laboratory, University of California, Santa Barbara

Johan Björkman Professor of Economics, Stockholm School of Economics

PhD student in Economics, Bonn Graduate School of Economics.

PhD student in Economics, Uppsala University

Associate Professor of Economics, University of Oxford
