Cornell: 13 words on Reddit can poison AI search results
TL;DR
- Cornell researchers showed an attack called WARP can hijack AI research agents using as few as 13 words planted on Reddit.
- A single poisoned URL hits 38-51% mention rates conditional on exposure, and multi-URL targeting reaches 42-62%.
- AI agents cite user-generated pages in roughly 50% of queries, and the pressure is already visible on Reddit: r/Biohackers moderators have banned new peptide and HRT posts.
Cornell researchers say it takes about 13 words on a Reddit page to make a deep-research AI agent recommend whatever you want it to recommend. The paper, covered by 404 Media in a writeup and an accompanying YouTube video, calls the attack WARP, short for Web Agent Retrieval Poisoning, and tests it against three research-style systems: STORM, Co-STORM, and OmniThink.
The setup is uncomfortable for anyone using ChatGPT or Google's AI search as a real research tool. The authors, Hal Triedman, Tingwei Zhang, and Vitaly Shmatikov, report that AI agents cite user-generated content in roughly 50% of all queries, and that nearly 25% of all citations come from user-generated sites. According to the arXiv preprint, a single poisoned URL of around 13 words achieves a 38-51% mention rate conditional on exposure; multi-URL targeting pushes that to 42-62%. In the full-content setting, the poisoned snippet can make up less than 4% of the retrieved content and still hold a 30-53% conditional mention rate.
The proof-of-concept examples in 404 Media's writeup are the part to sit with. A planted comment in r/austinfood pushing Sol Azteca for Mexican food. A fake dating app called SilverPath aimed at divorced men over 50. A peptide dose tracker whose creators allegedly seeded an r/Biohackers thread to get themselves cited. None of this required exotic tooling. There is already a cottage industry, with services like ReplyGuy marketing themselves as 'the AI that plugs your product on Reddit', and r/Biohackers' moderators have already banned new peptide and HRT posts in response to this kind of seeded promotion.
The honest caveat is that these mention rates are measured against three specific research-agent systems, not consumer ChatGPT in every configuration, and the percentages should be read as conditional on the poisoned page being retrieved rather than as a guarantee on any given prompt. What the reporting does not give you is how the major chatbot vendors are detecting or filtering this in production, or whether Reddit and Wikipedia have any plan beyond reactive moderation.
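To make the "conditional on exposure" caveat concrete, here is a minimal sketch of the arithmetic. It assumes a poisoned product can only be mentioned when the poisoned page is actually retrieved, so the overall rate is the exposure rate times the conditional rate. The 38-51% range is the single-URL figure quoted above; the exposure rates and the function name `overall_mention_rate` are hypothetical, chosen for illustration and not taken from the paper.

```python
def overall_mention_rate(p_exposure: float, p_mention_given_exposure: float) -> float:
    """Return P(mention) = P(exposure) * P(mention | exposure).

    Assumes the planted recommendation cannot appear unless the
    poisoned page was retrieved (an assumption of this sketch).
    """
    for p in (p_exposure, p_mention_given_exposure):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be in [0, 1]")
    return p_exposure * p_mention_given_exposure


# Conditional mention range reported for a single poisoned URL.
CONDITIONAL_LOW, CONDITIONAL_HIGH = 0.38, 0.51

# Hypothetical exposure rates, NOT from the paper: how often the
# poisoned page ends up in the agent's retrieved set for a query.
for p_exposure in (0.05, 0.25, 1.0):
    low = overall_mention_rate(p_exposure, CONDITIONAL_LOW)
    high = overall_mention_rate(p_exposure, CONDITIONAL_HIGH)
    print(f"exposure={p_exposure:.0%}: overall mention rate {low:.1%} to {high:.1%}")
```

Under these assumptions the headline percentages describe only the case where retrieval has already happened. How often that occurs for a given query is a separate quantity, and the reporting summarized here does not supply it.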
The forward-looking read: if answer engine optimization is going to be the next SEO, the quiet edge goes to the brands running it before anyone agrees on what counts as cheating, and to the platforms and tool builders that wire poisoning detection into the citation pipeline before the trust problem becomes the story.
Originally reported by youtube.com