arXiv, PubMed Central harbor 146,900 AI-hallucinated citations
Key insights
- Researchers found 146,900 AI-hallucinated citations across four major platforms, far exceeding prior annual estimates of the problem's scale.
- Hallucinated citations mimic real paper metadata, making them structurally harder to detect than fabricated facts and resistant to automated filtering.
- The spread across arXiv, bioRxiv, SSRN, and PubMed Central shows the problem extends well beyond any single platform's moderation scope.
Why this matters
At 146,900 hallucinated citations, the citation graph that underlies scientific credibility has been contaminated at a scale that manual review cannot address. Peer review processes treat prior citations as validated prior work, meaning fake references embedded today will compound as downstream papers cite them. Practitioners building AI tools for literature review, drug discovery, or academic research now face an unquantified false-positive risk embedded in the training and retrieval data those tools depend on.
Summary
A new study counted roughly 146,900 AI-hallucinated citations across arXiv, bioRxiv, SSRN, and PubMed Central, putting a hard number on a contamination problem that has outpaced platform-level responses.
The detection gap is structural. Hallucinated citations mimic real paper metadata (author names, journal titles, plausible DOIs), so automated filters designed to catch factual errors miss them entirely.
The four platforms (arXiv, bioRxiv, SSRN, and PubMed Central) each host part of a problem that no single platform can contain:
- arXiv's recently announced hallucination ban targets one pipeline; hallucinated citations are spreading across the other three repositories at the same time.
- Standard citation tools cannot flag references that look structurally valid but point to nonexistent papers.
At 146,900 fake nodes in the citation graph, hallucinated references have become a foundational integrity problem.
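The gap between "looks structurally valid" and "points to a real paper" can be illustrated with a minimal sketch. It is not the study's method or any platform's actual filter. The DOI pattern is a loose approximation, the DOIs are made up, and `KNOWN_DOIS` is a hypothetical in-memory stand-in for a lookup against a registry of real records.

```python
import re

# Loose structural pattern for a DOI: "10.", a 4-9 digit prefix, "/", a suffix.
# This is an approximation for illustration, not a full DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(doi: str) -> bool:
    """Format-only check: passes any string shaped like a DOI."""
    return bool(DOI_PATTERN.match(doi.strip()))


def check_citation(doi: str, known_dois: set[str]) -> str:
    """Classify a cited DOI by format first, then by existence in a registry."""
    normalized = doi.strip().lower()
    if not looks_like_doi(normalized):
        return "malformed"
    if normalized not in known_dois:
        # The hallucination case: the format check passes, the lookup fails.
        return "well-formed but unresolved"
    return "resolved"


# Hypothetical registry contents (assumption: a real system would query a
# DOI registry or bibliographic database instead of a hard-coded set).
KNOWN_DOIS = {"10.1234/example.2020.001"}

print(check_citation("10.1234/example.2020.001", KNOWN_DOIS))
print(check_citation("10.1234/example.2021.999", KNOWN_DOIS))
print(check_citation("not-a-doi", KNOWN_DOIS))
```

A filter that stops at `looks_like_doi` accepts the second citation. Only the existence check separates it from the first, and even a resolved DOI would still need its title and authors compared against the registry record.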
Potential risks and opportunities
Risks
- Drug discovery and clinical research teams using AI literature synthesis tools (Elicit, Consensus, Semantic Scholar) face an unquantified risk that hallucinated citations have already seeded their training or retrieval corpora
- arXiv's announced hallucination ban could create a false sense of containment while bioRxiv and SSRN, lacking similar policies, continue accumulating fake citations at the same rate
- Publishers and institutional repositories relying on CrossRef DOI validation face reputational and legal exposure if hallucinated citations pass their screening and corrupt the outputs of publicly funded research
Opportunities
- Citation verification services (Scite, iThenticate, Retraction Watch) could expand into AI-hallucination detection as a distinct product category with clear institutional demand
- Academic publishers (Elsevier, Springer Nature, Wiley) have leverage to mandate citation-verification tooling as a submission requirement, creating a compliance market for reference-validation startups
- Startups building reference-validation layers for scientific AI workflows could position against the contaminated-data risk with database-anchored or cryptographic citation verification as a trust primitive
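As one possible reading of "database-anchored or cryptographic citation verification", the sketch below fingerprints a citation's normalized metadata with SHA-256 and compares it against a fingerprint stored for the same DOI in a trusted index. Everything here is an assumption for illustration: the field names, the normalization rules, the made-up record, and the `trusted` dictionary standing in for an authoritative database.

```python
import hashlib
import json


def citation_fingerprint(doi: str, title: str, authors: list[str], year: int) -> str:
    """Hash a canonical form of the metadata so trivial formatting differences
    (case, extra whitespace, author order) do not change the result."""
    canonical = {
        "doi": doi.strip().lower(),
        "title": " ".join(title.lower().split()),
        "authors": sorted(a.strip().lower() for a in authors),
        "year": int(year),
    }
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify(citation: dict, trusted: dict[str, str]) -> bool:
    """True only if the DOI is in the trusted index AND the metadata matches."""
    expected = trusted.get(citation["doi"].strip().lower())
    return expected is not None and expected == citation_fingerprint(**citation)


# Hypothetical trusted record; a real index would be built from a registry.
record = {
    "doi": "10.1234/example.2020.001",
    "title": "An Example Paper",
    "authors": ["A. Author", "B. Writer"],
    "year": 2020,
}
trusted = {record["doi"]: citation_fingerprint(**record)}

print(verify(record, trusted))                                   # matching record
print(verify({**record, "title": "A Different Paper"}, trusted)) # real DOI, wrong title
```

Sorting the authors discards author order, which is a deliberate simplification here. A stricter scheme would keep the order and would need agreed normalization rules for names and titles before fingerprints from different sources could be compared.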
What we don't know yet
- The per-platform breakdown, which was not disclosed, including whether PubMed Central's peer-reviewed subset shows a disproportionate concentration compared with preprint-only repositories like bioRxiv and SSRN
- Which AI writing tools or model families generated the bulk of detected hallucinations, and whether specific models show distinct citation-hallucination signatures
- Whether any of the 146,900 citations have already been cited by subsequent papers, propagating fake references into later literature
Originally reported by Phys.org
Original headline: Study: 146,900 AI-Hallucinated Citations Found Across arXiv, bioRxiv, SSRN, and PubMed Central