The promise and perils of AI use in peer review

Author: Renee Hoch, Head of Publication Ethics, PLOS

Artificial intelligence (AI) is having an outsized influence on many industries, including scholarly publishing. The theme for this year’s Peer Review Week explores how we can rethink Peer Review in the AI Era. Publishers like PLOS are carefully considering how AI can enhance peer review without undermining research integrity.

The current standard in the scholarly publishing community is that authors may use generative AI (genAI) in preparing submissions, with some caveats (see this COPE position statement and the PLOS policy). However, there are tight reins on AI use in peer review. Many journal policies specify that editors and reviewers must not upload submission content to genAI tools, and some either forbid genAI use in peer review altogether (e.g. Science) or allow only select use cases, such as translating or editing one’s own review comments (e.g. PLOS, Wiley).

Why this different standard? Policies restricting or prohibiting AI in peer review mitigate risks that genAI use by editors or reviewers could introduce, such as:

  • Breach of confidentiality for unpublished content and sensitive data
  • Loss of rigor and specificity in the assessment process
  • Fraudulent misrepresentation of genAI outputs and peer review contributors
  • Enablement and acceleration of peer review manipulation (e.g. by paper mills)

It may then seem paradoxical or even hypocritical that journals and publishers are exploring options for in-house AI use in peer review. A key difference between in-house use (by journal staff) and external use (by academic editors and reviewers) is that a journal can deploy in-house tools in a controlled technology environment that protects data security, so that confidential content is not ingested into training sets that affect other users’ outputs.

When data security measures are in place, AI can help journals enforce their standards and policies more consistently. For example, AI can detect issues such as incomplete, unverifiable, or retracted references, problematic statistical analyses, and non-adherence to data availability and pre-registration requirements, and produce review reports that query them. Human reviewers vary in the degree to which they address these types of issues, which can directly impact integrity and reproducibility.
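To make the reference-screening idea concrete, here is a minimal sketch of one such automated check: verifying that each cited DOI resolves to a registered work in the Crossref REST API and flagging any that cannot be confirmed. This is an illustrative example rather than a description of any journal’s production tooling; the function names are hypothetical, and a real screening pipeline would also consult retraction data (e.g. Retraction Watch) and handle API rate limits.

```python
# Minimal sketch: flag cited references whose DOIs cannot be verified.
# Illustrative only -- not any journal's actual screening pipeline.
import requests

CROSSREF_API = "https://api.crossref.org/works/"


def verify_doi(doi: str) -> bool:
    """Return True if the DOI resolves to a registered work in Crossref."""
    try:
        resp = requests.get(CROSSREF_API + doi, timeout=10)
        return resp.status_code == 200
    except requests.RequestException:
        # Network failure: treat as unverified rather than crashing the check.
        return False


def screen_references(dois: list[str]) -> list[str]:
    """Return the subset of cited DOIs that could not be verified."""
    return [doi for doi in dois if not verify_doi(doi)]


if __name__ == "__main__":
    # Hypothetical reference list extracted from a submission.
    cited = ["10.1371/journal.pone.0000000", "10.1000/definitely-not-real"]
    for doi in screen_references(cited):
        print(f"Could not verify reference DOI: {doi}")
```

A report generated from output like this would query the flagged references with the authors, which is exactly the kind of routine, checkable task that human reviewers handle inconsistently.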

Although there are several good use cases for AI in supporting peer review, humans remain indispensable for providing rigorous content assessment. Whereas genAI detects and averages preexisting content, humans innovate and evaluate. We introduce new ideas and perspectives, bring creativity, curiosity, and intellect, and are able to synthesize, contextualize, interpret, and critique based on knowledge that spans multiple domains. In short, machines are a long way from being able to replicate human cognition, and so humans can engage in peer review and scientific discourse in a way that machines cannot. In practical terms, this means that people can identify issues that would not be evident to a machine reader or algorithm, and which can be crucial to scientific validity and integrity.

With that said, moving toward a hybrid human+AI peer review model could mitigate known pain points in peer review, including the heavy burden it places on academics and longer-than-ideal review timelines. If AI covers the technical aspects of the assessment, then perhaps fewer reviewers are needed, focused on the aspects of peer review that require uniquely human executive functioning capabilities. As a proof of concept for this model, a talk at the 2025 Peer Review Congress discussed a ‘Fast Track’ peer review offering by NEJM AI in which decisions are issued within one week of submission, based only on the editors’ evaluation of the manuscript and two AI-generated reviews.

While one-week turnaround times are enticing, there are several reasons to include at least two human experts in peer review, whether as editors and/or reviewers. Authors and articles benefit from evaluations reflecting different (human) perspectives, and it often takes multiple individuals to provide the subject-matter and methodological expertise needed for a rigorous assessment. Importantly, having two or more humans involved in peer review also increases the likelihood that any major scientific and integrity issues will be identified, and it lends greater credibility overall to publications and journals. It also affords a degree of protection for authors, journals, and the broader community against issues that could compromise peer review, such as personal biases, competing interests, poor-quality assessments, and unethical (mis)use of peer review for personal gain.

The AI era may be here to stay, and publishers and researchers will continue to explore its uses, but caution and careful consideration must be applied at every step of the peer review process. And, at the end of the day, AI will never replace a person’s expertise and judgement.
