blog.neurips.cc via Reddit

NeurIPS Rejects 18.4% of Position Papers via Pangram AI Tool

By Alexis Dufresne
Published June 3, 2026 at 18:34 UTC

Key insights

NeurIPS desk-rejected 178 papers (18.4%) via Pangram; 123 more must prove human authorship by June 15, 2026.
Default Pangram window sizes flagged 42.7% of submissions; refined 100-word windows reduced that rate to 12.7%.
In the comparison venues, 0% of FAccT 2025 papers and 2.1% of NeurIPS E&D 2026 papers showed maximum AI scores.
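The percentages above can be cross-checked against each other with a few lines of arithmetic. The sketch below is only a consistency check on the figures quoted in this digest: the total number of submissions is not reported in the source, so `implied_total` is an estimate derived from 178 papers being 18.4%, and all names are illustrative.

```python
# Figures quoted in this digest; the total submission count is NOT reported
# and is only inferred here from 178 papers being 18.4% of the track.
DESK_REJECTED = 178
DESK_REJECTED_PCT = 18.4
ON_NOTICE = 123
DEFAULT_FLAG_PCT = 42.7
REFINED_FLAG_PCT = 12.7


def implied_total(count: int, pct: float) -> int:
    """Estimate the total implied by a count and its reported percentage."""
    return round(count / (pct / 100))


def share_pct(count: int, total: int) -> float:
    """Percentage share of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


total = implied_total(DESK_REJECTED, DESK_REJECTED_PCT)
notice_pct = share_pct(ON_NOTICE, total)
swing_points = round(DEFAULT_FLAG_PCT - REFINED_FLAG_PCT, 1)

print(f"implied submissions: about {total}")
print(f"notice cohort share: {notice_pct}%")
print(f"default vs refined swing: {swing_points} percentage points")
```

Under that assumption the track received roughly 967 submissions, which makes the 123-paper notice cohort about 12.7% of the total, consistent with the share quoted in the summary.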

Why this matters

The use of a proprietary AI detection tool as an automatic desk-rejection gatekeeper at NeurIPS establishes a precedent where algorithmic score thresholds, not editorial review, determine a paper's fate for nearly one in five submissions. The 30-percentage-point swing between Pangram's default (42.7%) and refined (12.7%) flagging configurations reveals that rejection outcomes are highly sensitive to detection window choices that submitting authors have no visibility into. Researchers whose AI use falls squarely within the explicitly permitted scope of copyediting must now reconstruct multi-stage version histories just to remain in review, which shifts the burden of proof onto authors.

Summary

NeurIPS 2026 Position Paper Track chairs Alex Lu, Seth Lazar, and David Rugamer desk-rejected 178 submissions (18.4%) via Pangram AI detection scores, with no appeal available for those papers. Another 123 papers (12.7%) received conditional notice: document pre-AI and post-AI writing checkpoints by June 15, 2026, or face rejection. Three separate thresholds drove the 178 cuts: Pangram scores above 0.9 (77 papers), scores above 0.8 paired with solo-authorship flags (79 papers), and scores above 0.5 where authors explicitly denied any AI use (22 papers). Essentially: NeurIPS chairs and Pangram replaced editorial judgment with layered algorithmic thresholds for nearly one in five submissions.

- Conference policy allows AI only for "copy-editing or similar peripheral changes to the main text"
- Default window sizes (250-350 words) flagged 42.7% of papers; refined 100-word windows cut this to 12.7%
- For context: FAccT 2025 papers showed 0% at maximum AI scores; NeurIPS E&D 2026 showed 2.1%

The gap between the 42.7% default flag rate and the 12.7% refined rate is the calibration question this enforcement model leaves unanswered.
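The three desk-rejection thresholds in the summary can be read as a layered rule. What follows is a minimal sketch of that reading, not NeurIPS's or Pangram's actual code: the `Submission` fields, the function name, and the order in which the rules are checked are all assumptions, and the criteria for the 123-paper notice cohort are not modeled because the source does not state them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Submission:
    pangram_score: float    # detector score, assumed to lie in [0, 1]
    solo_author_flag: bool  # hypothetical encoding of the "solo-authorship flag"
    denied_ai_use: bool     # authors explicitly declared no AI use


def desk_reject_reason(sub: Submission) -> Optional[str]:
    """Return the first matching threshold, or None if no threshold applies.

    The check order (strictest score first) is an assumption; the source
    only reports how many papers fell under each threshold.
    """
    if sub.pangram_score > 0.9:
        return "score > 0.9"
    if sub.pangram_score > 0.8 and sub.solo_author_flag:
        return "score > 0.8 with solo-authorship flag"
    if sub.pangram_score > 0.5 and sub.denied_ai_use:
        return "score > 0.5 with denied AI use"
    return None


examples = [
    Submission(0.95, False, False),
    Submission(0.85, True, False),
    Submission(0.60, False, True),
    Submission(0.85, False, False),
]
for sub in examples:
    print(sub.pangram_score, desk_reject_reason(sub))

# The per-threshold counts reported in the summary add up to the total.
assert 77 + 79 + 22 == 178
```

In this reading, a paper scoring 0.85 with neither flag set is not desk-rejected, which shows how much the outcome depends on the auxiliary flags rather than on the score alone.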

Potential risks and opportunities

Risks

Authors in the 123-paper notice cohort who cannot reconstruct pre-AI writing checkpoints by June 15, 2026 face desk rejection even if their AI use was fully policy-compliant
ICML, ICLR, and other top-tier venues may adopt Pangram at the same default 250-350 word window settings, scaling a 42.7% flag rate across the conference circuit before calibration is resolved
Chairs Alex Lu, Seth Lazar, and David Rugamer face credibility risk if later review reveals that a substantial share of the 178 rejections, which carry no appeal, involved only peripheral AI edits permitted under the track's own policy

Opportunities

Writing workflow tools that auto-generate timestamped pre-AI and post-AI version checkpoints would directly address the documentation burden NeurIPS now requires from appealing authors
AI detection vendors competing with Pangram (GPTZero, Turnitin, Copyleaks) have an opening to pitch calibrated, peer-reviewed alternatives to conference organizers troubled by the gap between the 42.7% default and 12.7% refined flag rates
Researchers studying AI writing detection can use NeurIPS 2026 Position Paper Track as a benchmark dataset: named methodology, real institutional stakes, and a controlled venue comparison against FAccT 2025 and NeurIPS E&D 2026

What we don't know yet

Whether Pangram's choice of 100-word refined windows versus the 250-350 word defaults was peer-reviewed or independently validated before it was used to decide binding rejections
How many of the 178 desk-rejected papers used AI only for peripheral copyediting within the policy's explicit permission, a figure NeurIPS has not reported
Whether NeurIPS will publish aggregate appeal outcomes after June 15, 2026 so the accuracy of the Pangram thresholds can be assessed retrospectively

Shared on Bluesky by 7 AI experts (top 5 by trust)

Mark Riedl @markriedl.bsky.social: NeurIPS statement on the use of AI in the position paper track blog.neurips.cc/2026/06/02/a...
NeurIPS Conference @neuripsconf.bsky.social: This year, the NeurIPS 2026 Position Paper Track made the decision to require that all papers be substantially human-written, with AI used f…
Suresh Venkatasubramanian @geomblog.bsky.social: Almost 30% of the submissions to the Neurips position paper track were deemed to be ai generated or heavily ai-implicated. blog.neurips.cc/2…
Michael Ekstrand @md.ekstrandom.net amplified

Jason Moore @moorejh.bsky.social: I understand and respect this policy. The problem is that now you have enforce it. A properly constructed agentic AI workflow can produce te…

Originally reported by blog.neurips.cc

Original headline: NeurIPS Partners With Pangram AI Detector to Desk-Reject 18.4% of Position Paper Track Submissions — Methodology Criticized for Flagging Permissible Copyediting as Policy Violation