The framework achieves a high success rate because LLMs are highly likely to follow instructions that appear at the very beginning of a prompt.
The primary limitation is that it relies on indirect prompt injection (hidden text placed in the source PDF), so it only works if the reviewer uploads that specific document to an AI tool. Source: Detecting LLM-Generated Peer Reviews (arXiv).
By injecting these "hidden instructions" into a paper's PDF, editors can detect whether a reviewer used AI. If the generated review begins with one of these 109,989 unique citations, it is statistically likely to be AI-generated.
The system prompts an LLM to start its review with a specific phrase, such as: "Following [Surname] et al. ([Year]), this paper...".
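The phrase template and the prefix check described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the function names, the small SURNAMES and YEARS pools, and the case- and whitespace-insensitive matching rule are all assumptions made for the example.

```python
import random
import re

# Hypothetical candidate pools; the study's actual surname and year lists
# are not reproduced here.
SURNAMES = ["Baker", "Nguyen", "Okafor"]
YEARS = [2017, 2019, 2021]


def make_watermark(surname: str, year: int) -> str:
    """Build the opening phrase that the hidden instruction asks the LLM to use."""
    return f"Following {surname} et al. ({year}), this paper"


def pick_watermark(rng: random.Random) -> str:
    """Draw one fabricated citation uniformly at random from the candidate pools."""
    return make_watermark(rng.choice(SURNAMES), rng.choice(YEARS))


def _normalize(text: str) -> str:
    """Collapse runs of whitespace, trim the ends, and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def begins_with_watermark(review: str, watermark: str) -> bool:
    """Return True if the review opens with the watermark phrase."""
    return _normalize(review).startswith(_normalize(watermark))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    watermark = pick_watermark(rng)
    review = watermark + " proposes a new benchmark for code generation."
    print(begins_with_watermark(review, watermark))
    print(begins_with_watermark("This paper proposes a benchmark.", watermark))
```

In this sketch an editor would record the watermark chosen for each paper and later run begins_with_watermark on every submitted review of that paper.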
The topic originates from a 2025 study, Detecting LLM-Generated Peer Reviews. Researchers developed a watermarking system that uses fabricated citations to flag reviews created by AI instead of human experts.
As a tool for academic integrity, this framework offers several notable advantages and limitations, based on the study's findings:
The framework provides strong statistical guarantees: it maintains a low "family-wise error rate" (FWER), which bounds the probability that any human-written review is falsely flagged as AI.
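To give a feel for why a large watermark pool keeps false flags rare, the following sketch computes a simple union bound. It is a simplified argument and not necessarily the study's exact procedure: it assumes each paper's watermark is drawn uniformly at random from the candidates, independently of anything a human reviewer would write, so a given human review matches with probability at most 1/N. The function name and the example review count are hypothetical.

```python
def false_flag_bound(num_watermarks: int, num_reviews: int = 1) -> float:
    """Union bound on the chance that at least one human-written review
    happens to begin with the watermark drawn for its paper.

    Assumes uniform, independent watermark selection (an assumption of
    this sketch), so each review matches with probability at most
    1 / num_watermarks.
    """
    if num_watermarks <= 0:
        raise ValueError("num_watermarks must be positive")
    if num_reviews < 0:
        raise ValueError("num_reviews must be non-negative")
    # A probability can never exceed 1, so cap the bound.
    return min(1.0, num_reviews / num_watermarks)


if __name__ == "__main__":
    pool_size = 109_989  # number of possible watermarks named in the text
    print(false_flag_bound(pool_size))        # bound for a single review
    print(false_flag_bound(pool_size, 1000))  # bound across 1,000 reviews
```

The bound grows linearly with the number of reviews checked, which is why the size of the pool matters: a larger pool leaves more room before the family-wise bound becomes loose.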
Overview of Topic 109989
Based on recent research on detecting AI-generated content, 109989 refers to a specific set of 109,989 possible watermarks used to identify peer reviews written by Large Language Models (LLMs).